summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2018-12-281-5/+3Star
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "This has been a fairly typical cycle, with the usual sorts of driver updates. Several series continue to come through which improve and modernize various parts of the core code, and we finally are starting to get the uAPI command interface cleaned up. - Various driver fixes for bnxt_re, cxgb3/4, hfi1, hns, i40iw, mlx4, mlx5, qib, rxe, usnic - Rework the entire syscall flow for uverbs to be able to run over ioctl(). Finally getting past the historic bad choice to use write() for command execution - More functional coverage with the mlx5 'devx' user API - Start of the HFI1 series for 'TID RDMA' - SRQ support in the hns driver - Support for new IBTA defined 2x lane widths - A big series to consolidate all the driver function pointers into a big struct and have drivers provide a 'static const' version of the struct instead of open coding initialization - New 'advise_mr' uAPI to control device caching/loading of page tables - Support for inline data in SRPT - Modernize how umad uses the driver core and creates cdev's and sysfs files - First steps toward removing 'uobject' from the view of the drivers" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (193 commits) RDMA/srpt: Use kmem_cache_free() instead of kfree() RDMA/mlx5: Signedness bug in UVERBS_HANDLER() IB/uverbs: Signedness bug in UVERBS_HANDLER() IB/mlx5: Allocate the per-port Q counter shared when DEVX is supported IB/umad: Start using dev_groups of class IB/umad: Use class_groups and let core create class file IB/umad: Refactor code to use cdev_device_add() IB/umad: Avoid destroying device while it is accessed IB/umad: Simplify and avoid dynamic allocation of class IB/mlx5: Fix wrong error unwind IB/mlx4: Remove set but not used variable 'pd' RDMA/iwcm: Don't copy past the end of dev_name() string IB/mlx5: Fix long EEH recover time with NVMe offloads IB/mlx5: Simplify netdev unbinding IB/core: Move query port to ioctl RDMA/nldev: Expose port_cap_flags2 IB/core: uverbs copy to struct or zero helper IB/rxe: Reuse code which sets port state IB/rxe: Make counters thread safe IB/mlx5: Use the correct commands for UMEM and UCTX allocation ...
| * Merge branch 'mlx5-next' into rdma.gitJason Gunthorpe2018-12-2019-146/+238
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 updates taken for dependencies on following patches. * branche 'mlx5-next': (23 commits) IB/mlx5: Introduce uid as part of alloc/dealloc transport domain net/mlx5: Add shared Q counter bits net/mlx5: Continue driver initialization despite debugfs failure net/mlx5: Fold the modify lag code into function net/mlx5: Add lag affinity info to log net/mlx5: Split the activate lag function into two routines net/mlx5: E-Switch, Introduce flow counter affinity IB/mlx5: Unify e-switch representors load approach between uplink and VFs net/mlx5: Use lowercase 'X' for hex values net/mlx5: Remove duplicated include from eswitch.c net/mlx5: Remove the get protocol device interface entry net/mlx5: Support extended destination format in flow steering command net/mlx5: E-Switch, Change vhca id valid bool field to bit flag net/mlx5: Introduce extended destination fields net/mlx5: Revise gre and nvgre key formats net/mlx5: Add monitor commands layout and event data net/mlx5: Add support for plugged-disabled cable status in PME net/mlx5: Add support for PCIe power slot exceeded error in PME net/mlx5: Rework handling of port module events net/mlx5: Move flow counters data structures from flow steering header ...
| | * net/mlx5: Continue driver initialization despite debugfs failureLeon Romanovsky2018-12-171-5/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The failure to create debugfs entry is unpleasant event, but not enough to abort drier initialization. Align the mlx5_core code to debugfs design and continue execution whenever debugfs_create_dir() successes or not. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | Merge tag 'v4.20-rc6' into rdma.git for-nextJason Gunthorpe2018-12-1124-108/+148
| |\ \ | | | | | | | | | | | | For dependencies in following patches.
* | | | net/mlx4_core: drop useless LIST_HEADJulia Lawall2018-12-241-5/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop LIST_HEAD where the variable it declares has never been used. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier x; @@ - LIST_HEAD(x); ... when != x // </smpl> Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | mlxsw: spectrum: drop useless LIST_HEADJulia Lawall2018-12-241-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop LIST_HEAD where the variable it declares is never used. The uses were removed in 244cd96adb5f ("net_sched: remove list_head from tc_action"), but not the declaration. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier x; @@ - LIST_HEAD(x); ... when != x // </smpl> Fixes: 244cd96adb5f ("net_sched: remove list_head from tc_action") Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | net/mlx5e: drop useless LIST_HEADJulia Lawall2018-12-241-3/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop LIST_HEAD where the variable it declares is never used. These became useless in 244cd96adb5f ("net_sched: remove list_head from tc_action") The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier x; @@ - LIST_HEAD(x); ... when != x // </smpl> Fixes: 244cd96adb5f ("net_sched: remove list_head from tc_action") Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | net/mlx5e: fix semicolon.cocci warningskbuild test robot2018-12-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:1339:57-58: Unneeded semicolon Remove unneeded semicolon. Generated by: scripts/coccinelle/misc/semicolon.cocci Fixes: 4c8fb2986d44 ("net/mlx5e: Increase VF representors' SQ size to 128") CC: Gavi Teitz <gavi@mellanox.com> Signed-off-by: kbuild test robot <fengguang.wu@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | net/mlx5e: XDP, Add user control for XDP TX MPWQE featureTariq Toukan2018-12-213-1/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add ethtool private flag 'xdp_tx_mpwqe' to control the feature from userspace. Feature is set ON by default, if supported. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: XDP, Support Enhanced Multi-Packet TX WQETariq Toukan2018-12-214-27/+173
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for the HW feature of multi-packet WQE in XDP xmit flow. The conventional TX descriptor (WQE, Work Queue Element) serves a single packet. Our HW has support for multi-packet WQE (MPWQE) in which a single descriptor serves multiple TX packets. This reduces both the PCI overhead and the CPU cycles wasted on writing them. In this patch we add support for the HW feature, which is supported starting from ConnectX-5. Performance: Tested packet rate for UDP 64Byte multi-stream over ConnectX-5 NICs. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz XDP_TX: We see a huge gain on single port ConnectX-5, and reach the 100 Mpps milestone. * Single-port HCA: Before: 70 Mpps After: 100 Mpps (+42.8%) * Dual-port HCA: Before: 51.7 Mpps After: 57.3 Mpps (+10.8%) * In both cases we tested traffic on one port and for now On Dual-port HCAs we see only small gain, we are working to overcome this bottleneck, but for the moment only with experimental firmware on dual port HCAs we can reach the wanted numbers as seen on Single-port HCAs. XDP_REDIRECT: Redirect from (A) ConnectX-5 to (B) ConnectX-5. Due to a setup limitation, (A) and (B) are on different NUMA nodes, so absolute performance numbers are not optimal. Note: Below is the transmit rate of (B), not the redirect rate of (A) which is in some cases higher. * (B) is single-port: Before: 77 Mpps After: 90 Mpps (+16.8%) * (B) is dual-port: Before: 61 Mpps After: 72 Mpps (+18%) Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: XDP, Add array for WQE info descriptorsTariq Toukan2018-12-213-21/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each xdp_wqe_info instance describes the number of data-segments and WQEBBs of the WQE. This is useful for a downstream patch that adds support for Multi-Packet TX WQE feature. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: XDP, Maintain a FIFO structure for xdp_info instancesTariq Toukan2018-12-214-24/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides infrastructure to have multiple xdp_info instances for the same consumer index. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: XDP, Replace boolean doorbell indication with segment pointerTariq Toukan2018-12-213-18/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of calculating the control segment to be used upon an XDP xmit doorbell, save it in SQ structure. Nullify when no pending doorbell. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: XDP, Warn upon polling an error CQETariq Toukan2018-12-211-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not ignore the CQE opcode. This helps expose issues and debug them. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: XDP, Change the XDP SQ redirect indicationTariq Toukan2018-12-215-26/+18Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not maintain an SQ state bit to indicate whether an XDP SQ serves redirect operations. Instead, rely on the fact that such an XDP SQ doesn't reside in an RQ instance, while the others do. This info is not known to the XDP SQ functions themselves, and they rely on their callers to distinguish between the cases. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: XDP, Precede XDP-related operations in RQ poll by a loaded ↵Tariq Toukan2018-12-213-11/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | program check At the end of the RQ polling loop, some XDP-related operations might be required. Before checking them one by one, check if an XDP program is even loaded. Combine all the checks and operations in a single function in xdp files. This saves unnecessary checks for non-XDP flows. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | net/mlx5e: TX, Print opcode in error CQE warningTariq Toukan2018-12-211-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The opcode indicates about the error reason. Printing it helps in debug. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | Merge tag 'mlx5-updates-2018-12-19' of ↵David S. Miller2018-12-219-103/+177
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2018-12-19 This series adds some misc updates and the support for tunnels over VLAN tc offloads. From Miroslav Lichvar, patches #1,2 1) Update timecounter at least twice per counter overflow 2) Extend PTP gettime function to read system clock From Gavi Teitz, patch #3 3) Increase VF representors' SQ size to 128 From Eli Britstein and Or Gerlitz, patches #4-10 4) Adds the capability to support tunnels over VLAN device. Patch 4 avoids crash for TC flow with egress upper devices Patch 5 refactors tunnel routing devs into a helper function Patch 6 avoids crash for TC encap flows with vlan on underlay Patches 7-8 refactor encap tunnel header preparing code. Patch 9 adds support for building VLAN tagged ETH header. Patch 10 adds support for tunnel routing to VLAN device. From Aviv, patches 11,12 to fix earlier VF lag series 5) Fix query_nic_sys_image_guid() error during init 6) Fix LAG requirement when CONFIG_MLX5_ESWITCH is off ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/mlx5: Fix LAG requirement when CONFIG_MLX5_ESWITCH is offAviv Heller2018-12-201-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If CONFIG_MLX5_ESWITCH is not defined, test for SR-IOV being disabled, instead of calling e-switch LAG prereq routine. Since LAG with SRIOV is allowed only when switchdev mode is on. Fixes: eff849b2c669 ("net/mlx5: Allow/disallow LAG according to pre-req only") Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5: Fix query_nic_sys_image_guid() error during initAviv Heller2018-12-201-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vport system image guid should be queried using vport nic API for Ethernet ports, and vport hca API for Infiniband ports. Fixes: fadd59fc50d0 ("net/mlx5: Introduce inter-device communication mechanism") Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Support tunnel encap over tagged EthernetEli Britstein2018-12-201-20/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generate encap header depending on the routed device to support native/tagged Ethernet header. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Support VLAN encap ETH header generationEli Britstein2018-12-201-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support generation of native or tagged Ethernet header for encap header, depending on provided net device. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Re-order route and encap header memory allocationEli Britstein2018-12-201-28/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the order to first route IPv4/6 and return if error. Only after successful route continue to allocate an encap header, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Tunnel encap ETH header helper functionEli Britstein2018-12-201-12/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In tunnel encap we prepare the encap header for IPv4/6 cases, in two separate functions. For ETH header generation the code is almost duplicated. Move the ETH header generation code from IPv4/6 functions to a helper function, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Fail attempt to offload e-switch TC encap flows with vlan on underlayEli Britstein2018-12-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we don't support nor fail attempts to offload encap flows routed to vlan device on the underlay network. We wrongly consider a vlan underlay device to be on the same e-switch b/c the switchdev ID is retrieved recursively. Add explicit check for that and fail such attempts. Also align to a more strict check for the ingress and the underlay devices to practically be on the same eswitch. Fixes: ce99f6b97fcd ('net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels') Fixes: 3e621b19b0bb ('net/mlx5e: Support TC encapsulation offloads with upper devices') Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Tunnel routing output devs helper functionEli Britstein2018-12-201-36/+34Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For tunnel we determine the output devs for IPv4/6 cases, in two separate functions, with a duplicated code. Move that code from IPv4/6 functions to a helper function, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Fail attempt to offload e-switch TC flows with egress upper devicesEli Britstein2018-12-203-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use the switchdev parent HW id helper to identify if the mirred device shares the same ASIC/port with the ingress device. This can get us wrong in the presence of upper devices such as vlan or bridge set over the HW devices (VF or uplink representors), b/c the switchdev ID is retrieved recursively. To fail offload attempts in such cases, we condition the check on the egress device to have not only the same switchdev ID but also the relevant mlx5 netdev ops. Fixes: 03a9d11e6eeb ('net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads') Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Allow vlans on e-switch uplink repsOr Gerlitz2018-12-201-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are cases (e.g tunneling with vlan on underlay and potentially more) where this makes sense, so allow that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Increase VF representors' SQ size to 128Gavi Teitz2018-12-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default size for the VF representors' SQ was too small to handle high packet rates. Doubling the size from 64 to 128 drastically improves the packet rate under stress (by about 50%), whereas increasing the size beyond 128 has not shown to make any further difference. The impact of the SQ size was measured with UDP traffic, in the following topology: TG <-> PF <-> TC forwarding <-> VF representor <-> VF in VM over a single core processing bi-directional traffic, with the following results: SQ size of 64: SQ size of 128: Packet rate for 64B UDP packets: 860 [Kpps] 1280 [Kpps] Packet rate for 114B VxLan encapsulated UDP packets: 320 [Kpps] 500 [Kpps] Signed-off-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | mlx5: extend PTP gettime function to read system clockMiroslav Lichvar2018-12-203-11/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Read the system time right before and immediately after reading the low register of the internal timer. This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | mlx5: update timecounter at least twice per counter overflowMiroslav Lichvar2018-12-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The timecounter needs to be updated at least once in half of the cyclecounter interval to prevent timecounter_cyc2time() interpreting a new timestamp as an old value and causing a backward jump. This would be an issue if the timecounter multiplier was so small that the update interval would not be limited by the 64-bit overflow in multiplication. Shorten the calculated interval to make sure the timecounter is updated in time even when the system clock is slowed down by up to 10%, the multiplier is increased by up to 10%, and the scheduled overflow check is late by 15%. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Ariel Levkovich <lariel@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | | mlxsw: spectrum: Remove limitation regarding VID 1Ido Schimmel2018-12-211-5/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VID 1 is not reserved anymore, so remove the check that prevented the creation of VLAN devices with this VID over mlxsw ports. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlxsw: spectrum: Switch to VID 4095 as default VIDIdo Schimmel2018-12-213-21/+11Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to abuse VID 1 anymore and we can instead use VID 4095 as the default VLAN, which will be configured on the port throughout its lifetime. The OVS join / leave functions are changed to enable VIDs 1-4094 (inclusive) instead of 2-4095. This because VID 4095 is now the default VLAN instead of 1. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlxsw: spectrum: Add an helper function to cleanup VLAN entriesIdo Schimmel2018-12-211-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VLAN entries on a port can be associated with either a bridge VLAN or a router port. Before the VLAN entry is destroyed these associations need to be cleaned up. Currently, this is always invoked from the function which destroys the VLAN entry, but next patch is going to skip the destruction of the default entry when a port in unlinked from a LAG. The above does not mean that the associations should not be cleaned up, so add a helper that will be invoked from both call sites. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlxsw: spectrum: Store pointer to default port VLAN in port structIdo Schimmel2018-12-212-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Subsequent patches will need to access the default port VLAN. Since this VLAN will exist throughout the lifetime of the port, simply store it in the port's struct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlxsw: spectrum: Allow controlling destruction of default port VLANIdo Schimmel2018-12-211-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function allows flushing all the existing VLAN entries on a port. It is invoked when a port is destroyed and when it is unlinked from a LAG. In the latter case, when moving to the new default VLAN, there will not be a need to destroy the default VLAN entry. Therefore, add an argument that allows to control whether the default port VLAN should be destroyed or not. Currently it is always set to 'true'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlxsw: spectrum: Set PVID during port initializationIdo Schimmel2018-12-211-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the driver does not set the port's PVID when initializing a new port. This is because the driver is using VID 1 as PVID which is the firmware default. Subsequent patches are going to change the PVID the driver is setting when initializing a new port. Prepare for that by explicitly setting the port's PVID. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlxsw: spectrum: Replace hard-coded default VID with a defineIdo Schimmel2018-12-214-15/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Subsequent patches are going to replace the current default VID (1) with VLAN_N_VID - 1 (4095). Prepare for this conversion by replacing the hard-coded '1' with a define. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | mlxsw: spectrum_router: Do not force specific configuration orderIdo Schimmel2018-12-213-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In symmetric routing, the only two members in the VLAN corresponding to the L3 VNI are the router port and the VXLAN tunnel. In case the VXLAN device is already enslaved to the bridge and only later the VLAN interface is configured, the tunnel will not be offloaded. The reason for this is that when the router interface (RIF) corresponding to the VLAN interface is configured, it calls the core fid_get() API which does not check if NVE should be enabled on the FID. Instead, call into the bridge code which will check if NVE should be enabled on the FID. This effectively means that the same code path is used to retrieve a FID when either a local port or a router port joins the FID. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-12-2013-48/+72
|\ \ \ \ \ | |/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/mlx5e: Remove the false indication of software timestamping supportAlaa Hleihel2018-12-191-8/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5 driver falsely advertises support of software timestamping. Fix it by removing the false indication. Fixes: ef9814deafd0 ("net/mlx5e: Add HW timestamping (TS) support") Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5: Typo fix in del_sw_hw_ruleYuval Avnery2018-12-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expression terminated with "," instead of ";", resulted in set_fte getting bad value for modify_enable_mask field. Fixes: bd5251dbf156 ("net/mlx5_core: Introduce flow steering destination of type counter") Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: RX, Fix wrong early return in receive queue pollTariq Toukan2018-12-191-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the completion queue of the RQ is empty, do not immediately return. If left-over decompressed CQEs (from the previous cycle) were processed, need to go to the finalization part of the poll function. Bug exists only when CQE compression is turned ON. This solves the following issue: mlx5_core 0000:82:00.1: mlx5_eq_int:544:(pid 0): CQ error on CQN 0xc08, syndrome 0x1 mlx5_core 0000:82:00.1 p4p2: mlx5e_cq_error_event: cqn=0x000c08 event=0x04 Fixes: 4b7dfc992514 ("net/mlx5e: Early-return on empty completion queues") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | mlxsw: spectrum_nve: Fix memory leak upon driver reloadIdo Schimmel2018-12-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pointer was NULLed before freeing the memory, resulting in a memory leak. Trace from kmemleak: unreferenced object 0xffff88820ae36528 (size 512): comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330 [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0 [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690 [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260 [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360 [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220 [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0 [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150 [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180 [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0 [<00000000efc4eae8>] genl_rcv+0x2d/0x40 [<00000000ea645603>] netlink_unicast+0x52f/0x740 [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50 [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120 [<00000000d85795a9>] __sys_sendto+0x397/0x620 [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0 Fixes: 6e6030bd5412 ("mlxsw: spectrum_nve: Implement common NVE core") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | mlxsw: spectrum: Add trap for decapsulated ARP packetsIdo Schimmel2018-12-182-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After a packet was decapsulated it is classified to the relevant FID based on its VNI and undergoes L2 forwarding. Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap decapsulated ARP packets during L2 forwarding and instead can only trap such packets in the underlay router during decapsulation. Add this missing packet trap, which is required for VXLAN routing when the MAC of the target host is not known. Fixes: b02597d513a9 ("mlxsw: spectrum: Add NVE packet traps") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | mlxsw: core: Increase timeout during firmware flash processShalom Toledo2018-12-183-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the firmware flash process, some of the EMADs get timed out, which causes the driver to send them again with a limit of 5 retries. There are some situations in which 5 retries is not enough and the EMAD access fails. If the failed EMAD was related to the flashing process, the driver fails the flashing. The reason for these timeouts during firmware flashing is cache misses in the CPU running the firmware. In case the CPU needs to fetch instructions from the flash when a firmware is flashed, it needs to wait for the flashing to complete. Since flashing takes time, it is possible for pending EMADs to timeout. Fix by increasing EMADs' timeout while flashing firmware. Fixes: ce6ef68f433f ("mlxsw: spectrum: Implement the ethtool flash_device callback") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/mlx5e: Cancel DIM work on close SQTal Gilboa2018-12-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TXQ SQ closure is followed by closing the corresponding CQ. A pending DIM work would try to modify the now non-existing CQ. This would trigger an error: [85535.835926] mlx5_core 0000:af:00.0: mlx5_cmd_check:769:(pid 124399): MODIFY_CQ(0x403) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1d7771) Fix by making sure to cancel any pending DIM work before destroying the SQ. Fixes: cbce4f444798 ("net/mlx5e: Enable adaptive-TX moderation") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Remove unused UDP GSO remaining counterMikhael Goikhman2018-12-132-4/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove tx_udp_seg_rem counter from ethtool output, as it is no longer being updated in the driver's data flow. Fixes: 3f44899ef2ce ("net/mlx5e: Use PARTIAL_GSO for UDP segmentation") Signed-off-by: Mikhael Goikhman <migo@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Avoid encap flows deletion attempt the 1st time a neigh is resolvedOr Gerlitz2018-12-132-6/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we are deleting offloaded encap flows in case the relevant neigh becomes unconnected while the encap is valid (a sign that it used to be connected), or if the curr neigh mac is different from the cached mac (a sign that the remote side changed their mac). The 2nd check also applies when the neigh becomes connected on the 1st time (we start with zero mac). Before the offending commit, the deleting handler was practically no op, as no flows were offloaded. But since that commit, we offload neigh-less encap flows to slow path. Under mirroring scheme, we go into the delete handler, attempt to unoffload a mirror rule which was never set (as we were offloading to slow path) and crash. Fix that by calling the delete handler only when the encap is valid, which covers both cases mentioned above. Fixes: 5dbe906ff1d5 ('net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't available') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | net/mlx5e: Properly initialize flow attributes for slow path eswitch rule ↵Or Gerlitz2018-12-131-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | deletion When a neighbour is resolved, we delete the goto slow path rule from HW. The eswitch flow attributes where not properly initialized on that case, hence we mess up the eswitch refcounts for chain zero (the default one). Fix that along with making sure to use semicolons and not commas on that code; Fixes: 5dbe906ff1d5 ('net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't available') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>