summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox
Commit message (Collapse)AuthorAgeFilesLines
...
* net/mlx5: Add interface to hold and release core resourcesMoni Shoua2018-11-121-0/+16
| | | | | | | | | Sometimes upper layers may want to prevent the destruction of a core resource for a period of time while work on that resource is in progress. Add API to support this. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* net/mlx5: Release resource on error flowMoni Shoua2018-11-121-2/+2
| | | | | | | | | | Fix reference counting leakage when the event handler aborts due to an unsupported event for the resource type. Fixes: a14c2d4beee5 ("net/mlx5_core: Warn on unsupported events of QP/RQ/SQ") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* mlxsw: spectrum: Set minimum shaper on MC TCsPetr Machata2018-10-311-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | An MC-aware mode was introduced in commit 7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports"). In MC-aware mode, BUM traffic gets a special treatment by being assigned to a separate set of traffic classes 8..15. Pairs of TCs 0 and 8, 1 and 9, etc., are then configured to strictly prioritize the lower-numbered ones. The intention is to prevent BUM traffic from flooding the switch and push out all UC traffic, which would otherwise happen, and instead give UC traffic precedence. However strictly prioritizing UC traffic has the effect that UC overload pushes out all BUM traffic, such as legitimate ARP queries. These packets are kept in queues for a while, but under sustained UC overload, their lifetime eventually expires and these packets are dropped. That is detrimental to network performance as well. Therefore configure the MC TCs (8..15) with minimum shaper of 200Mbps (a minimum permitted value) to allow a trickle of necessary control traffic to get through. Fixes: 7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlxsw: reg: QEEC: Add minimum shaper fieldsPetr Machata2018-10-311-1/+21
| | | | | | | | | | | | | | | | Add QEEC.mise (minimum shaper enable) and QEEC.min_shaper_rate to enable configuration of minimum shaper. Increase the QEEC length to 0x20 as well: that's the length that the register has had for a long time now, but with the configurations that mlxsw typically exercises, the firmware tolerated 0x1C-sized packets. With mise=true however, FW rejects packets unless they have the full required length. Fixes: b9b7cee40579 ("mlxsw: reg: Add QoS ETS Element Configuration register") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: fix csum adjustments caused by RXFCSEric Dumazet2018-10-311-36/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As shown by Dmitris, we need to use csum_block_add() instead of csum_add() when adding the FCS contribution to skb csum. Before 4.18 (more exactly commit 88078d98d1bb "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), the whole skb csum was thrown away, so RXFCS changes were ignored. Then before commit d55bef5059dd ("net: fix pskb_trim_rcsum_slow() with odd trim offset") both mlx5 and pskb_trim_rcsum_slow() bugs were canceling each other. Now we fixed pskb_trim_rcsum_slow() we need to fix mlx5. Note that this patch also rewrites mlx5e_get_fcs() to : - Use skb_header_pointer() instead of reinventing it. - Use __get_unaligned_cpu32() to avoid possible non aligned accesses as Dmitris pointed out. Fixes: 902a545904c7 ("net/mlx5e: When RXFCS is set, add FCS data into checksum calculation") Reported-by: Paweł Staszewski <pstaszewski@itcare.pl> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eran Ben Elisha <eranbe@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Dimitris Michailidis <dmichail@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Paweł Staszewski <pstaszewski@itcare.pl> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Tested-By: Maria Pasechnik <mariap@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_en: add a missing <net/ip.h> includeEric Dumazet2018-10-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Abdul Haleem reported a build error on ppc : drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: `struct iphdr` declared inside parameter list [enabled by default] struct iphdr *iph) ^ drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/net/ethernet/mellanox/mlx4/en_rx.c: In function get_fixed_ipv4_csum: drivers/net/ethernet/mellanox/mlx4/en_rx.c:586:20: error: dereferencing pointer to incomplete type __u8 ipproto = iph->protocol; ^ Fixes: 55469bc6b577 ("drivers: net: remove <net/busy_poll.h> inclusion when not needed") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlxsw: core: Fix devlink unregister flowShalom Toledo2018-10-301-7/+17
| | | | | | | | | | | | | | | | | After a failed reload, the driver is still registered to devlink, its devlink instance is still allocated and the 'reload_fail' flag is set. Then, in the next reload try, the driver's allocated devlink instance will be freed without unregistering from devlink and its components (e.g, resources). This scenario can cause a use-after-free if the user tries to execute command via devlink user-space tool. Fix by not freeing the devlink instance during reload (failed or not). Fixes: 24cc68ad6c46 ("mlxsw: core: Add support for reload") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlxsw: spectrum_switchdev: Don't ignore deletions of learned MACsPetr Machata2018-10-301-2/+0Star
| | | | | | | | | | | | Demands to remove FDB entries should be honored even if the FDB entry in question was originally learned, and not added by the user. Therefore ignore the added_by_user datum for SWITCHDEV_FDB_DEL_TO_DEVICE. Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications") Signed-off-by: Petr Machata <petrm@mellanox.com> Suggested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers: net: remove <net/busy_poll.h> inclusion when not neededEric Dumazet2018-10-263-3/+0Star
| | | | | | | | | | | | Drivers using generic NAPI interface no longer need to include <net/busy_poll.h>, since busy polling was moved to core networking stack long ago. See commit 79e7fff47b7b ("net: remove support for per driver ndo_busy_poll()") for reference. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5: Allocate enough space for the FDB sub-namespacesDan Carpenter2018-10-231-1/+1
| | | | | | | | | | | FDB_MAX_CHAIN is three. We wanted to allocate enough memory to hold four structs but there are missing parentheses so we only allocate enough memory for three structs and the first byte of the fourth one. Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-10-1910-44/+45
|\ | | | | | | | | | | | | | | | | | | | | | | | | net/sched/cls_api.c has overlapping changes to a call to nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL to the 5th argument, and another (from 'net-next') added cb->extack instead of NULL to the 6th argument. net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to code which moved (to mr_table_dump)) in 'net-next'. Thanks to David Ahern for the heads up. Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: core: Fix use-after-free when flashing firmware during initIdo Schimmel2018-10-183-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the switch driver (e.g., mlxsw_spectrum) determines it needs to flash a new firmware version it resets the ASIC after the flashing process. The bus driver (e.g., mlxsw_pci) then registers itself again with mlxsw_core which means (among other things) that the device registers itself again with the hwmon subsystem again. Since the device was registered with the hwmon subsystem using devm_hwmon_device_register_with_groups(), then the old hwmon device (registered before the flashing) was never unregistered and was referencing stale data, resulting in a use-after free. Fix by removing reliance on device managed APIs in mlxsw_hwmon_init(). Fixes: c86d62cc410c ("mlxsw: spectrum: Reset FW after flash") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Tested-by: Alexander Petrovskiy <alexpe@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'mlx5-fixes-2018-10-10' of ↵David S. Miller2018-10-167-38/+28Star
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-10-10 This pull request includes some fixes to mlx5 driver, Please pull and let me know if there's any problem. For -stable v4.11: ('net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type') For -stable v4.17: ('net/mlx5: Fix memory leak when setting fpga ipsec caps') For -stable v4.18: ('net/mlx5: WQ, fixes for fragmented WQ buffers API') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net/mlx5: WQ, fixes for fragmented WQ buffers APITariq Toukan2018-10-115-32/+23Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5e netdevice used to calculate fragment edges by a call to mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct indication for queues smaller than a PAGE_SIZE, (broken by default on PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new calls/API. Since (TX/RX) Work Queues buffers are fragmented, here we introduce changes to the API in core driver, so that it gets a stride index and returns the index of last stride on same fragment, and an additional wrapping function that returns the number of physically contiguous strides that can be written contiguously to the work queue. This obsoletes the following API functions, and their buggy usage in EN driver: * mlx5_wq_cyc_get_frag_size() * mlx5_wq_cyc_ctr2fragix() The new API improves modularity and hides the details of such calculation for mlx5e netdevice and mlx5_ib rdma drivers. New calculation is also more efficient, and improves performance as follows: Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size. Before: 16,477,619 pps After: 17,085,793 pps 3.7% improvement Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault typeHuy Nguyen2018-10-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The HW spec defines only bits 24-26 of pftype_wq as the page fault type, use the required mask to ensure that. Fixes: d9aaed838765 ("{net,IB}/mlx5: Refactor page fault handling") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * net/mlx5: Fix memory leak when setting fpga ipsec capsTalat Batheesh2018-10-111-5/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocated memory for context should be freed once finished working with it. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Added 'raw_errors_laneX' fields to ethtool statisticsShay Agroskin2018-10-181-5/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | These are counters for errors received on rx side, such as FEC errors. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Ethtool driver callback for query/set FEC policyShay Agroskin2018-10-181-2/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver callback function for 'ethtool --show-fec', 'ethtool --set-fec' commands. The query function returns active and configured FEC policy for current link speed. The set function sets FEC policy for all supported link speeds. 1) If current link speed doesn't support requested FEC policy, the function fails. 2) If a different link speed doesn't support requested FEC policy, FEC capbilities for this speed are turned off. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Add port FEC get/set functionsShay Agroskin2018-10-182-0/+220
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added functions to query and set link FEC policy. To get/set FEC capabilities in PPLM reg we need to query current link speed. 'mlx5_get_fec_speed_field' queries current link speed and returns correct field offset. FEC Query's return value is divided into 'active FEC policy', which is the FEC policy used by the link, and 'configured FEC policy', which is the FEC policy requested by the user. The two values may differ if: 1) FEC policy was configured to 'auto', in which case the active FEC policy would be the default FEC policy for current link speed. 2) FEC policy was changed, but no link reset is performed. In which case, the active FEC policy would become the configured one after a link reset. FEC set function sets FEC policy for all link speeds and perform link reset. 1) If current link speed doesn't support requested FEC policy, the function fails. 2) If a different link speed doesn't support requested FEC policy, FEC capbilities for this speed are turned off and a warning message is printed. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5: Remove counter from idr after removing it from listVlad Buslov2018-10-181-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fs_counters list can temporary become unsorted when new counters are created/deleted concurrently. Idr is used to quickly lookup position to insert new counter in logarithmic time. However, if new flows are concurrently inserted during time window when flows with adjacent ids are already removed from idr but are still present in counters list, mlx5_fc_stats_work() observes counters list in inconsistent state, which results following warning: [ 1839.561955] mlx5_core 0000:81:00.0: mlx5_cmd_fc_bulk_get:587:(pid 729): Flow counter id (0x102d5) out of range (0x1c0a8..0x1c10b). Counter ignored. Move idr_remove() call to be executed synchronously with counter deletion from list. Extract this code to mlx5_fc_stats_remove() helper function that is called by workqueue job handler mlx5_fc_stats_work(). Fixes: 12d6066c3b29 ("net/mlx5: Add flow counters idr") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
* | | net/mlx5: Take fs_counters dellist before addlistVlad Buslov2018-10-181-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In fs_counters elements from both addlist and dellist are removed by mlx5_fc_stats_work() without any locking. This introduces race condition when batch of new rules is created and then immediately deleted (for example, when error occurred during flow creation). In such case some of the rules might be in dellist, but not in addlist when mlx5_fc_stats_work() is executed concurrently with tc, which will result rule deletion and use-after-free on next iteration because deleted rules are still in addlist. Always take dellist first to guarantee that rules can only be deleted after they were removed from addlist. Fixes: 6e5e22839136 ("net/mlx5: Add new list to store deleted flow counters") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reported-by: Chris Mi <chrism@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
* | | net/mlx5: Refactor fragmented buffer struct fields and init flowTariq Toukan2018-10-181-73/+47Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Take struct mlx5_frag_buf out of mlx5_frag_buf_ctrl, as it is not needed to manage and control the datapath of the fragmented buffers API. struct mlx5_frag_buf contains control info to manage the allocation and de-allocation of the fragmented buffer. Its fields are not relevant for datapath, so here I take them out of the struct mlx5_frag_buf_ctrl, except for the fragments array itself. In addition, modified mlx5_fill_fbc to initialise the frags pointers as well. This implies that the buffer must be allocated before the function is called. A set of type-specific *_get_byte_size() functions are replaced by a generic one. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | Merge tag 'mlx5-updates-2018-10-17' of ↵David S. Miller2018-10-1818-409/+1066
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2018-10-17 ======================================================================== From Or Gerlitz <ogerlitz@mellanox.com>: This series from Paul adds support to mlx5 e-switch tc offloading of multiple priorities and chains. This is made of four building blocks (along with few minor driver refactors): [1] Split FDB fast path prio to multiple namespaces Currently the FDB name-space contains two priorities, fast path (p0) and slow path (p1). The slow path contains the per representor SQ send-to-vport TX rule and the match-all RX miss rule. As a pre-step to support multi-chains and priorities, we split the FDB fast path to multiple namespaces (sub namespaces), each with multiple priorities. [2] E-Switch chains and priorities A chain is a group of priorities. We use the fdb parallel sub-namespaces to implement chains, and a flow table for each priority in them. Because these namespaces are parallel and in series to the slow path fdb, the chains aren't connected to each other (but to the slow path), and one must use a explicit goto action to reach a different chain. Flow tables for the priorities are created on demand and destroyed once not used. [3] Add a no-append flow insertion mode, use it for TC offloads Enhance the driver fs core, such that if a no-append flag is set by the caller, we add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. For encap rules, we defer the HW offloading till we have a valid neighbor. This can result in the packet hitting a lower priority rule in the HW DP. Use the no-append API to push these packets to the slow path FDB table, so they go to the TC kernel DP as done before priorities where supported. [4] Offloading tc priorities and chains for eswitch flows Using [1], [2] and [3] above we add the support for offloading both chains and priorities. To get to a new chain, use the tc goto action. We support a fixed prio range 1-16, and chains 0-3. ============================================================================= Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx5e: Support offloading tc priorities and chains for eswitch flowsPaul Blakey2018-10-174-12/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we fail when user specify a non-zero chain, this patch adds the support for it and tc priorities. To get to a new chain, use the tc goto action. Currently we support a fixed prio range 1-16, and chain range 0-3. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't availablePaul Blakey2018-10-171-22/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding a vxlan tc rule, and a neighbour isn't available, we don't insert any rule to hardware. Once we enable offloading flows with multiple priorities, a packet that should have matched this rule will continue in hardware pipeline and might match a wrong one. This is unlike in tc software path where it will be matched and forwarded to the vxlan device (which will cause a ARP lookup eventually) and stop processing further tc filters. To address that, when when a neighbour isn't available (EAGAIN from attach_encap), or gets deleted, change the original action to be a forward to slow path instead. Neighbour update will restore the original action once the neighbour becomes available. This will be done atomically so at any given time we will have a the correct match. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5: E-Switch, Enable setting goto slow path chain actionPaul Blakey2018-10-172-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A pre-step for the tc offloads code to use this when a neigh is not available for encap rules. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: Avoid duplicated code for tc offloads add/del fdb ruleOr Gerlitz2018-10-171-41/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code for adding/deleting fdb flow is repeated when user-space does flow add/del and when we add/del from the neigh update path - unify them to avoid the duplication. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: For TC offloads, always add new flow instead of appending the actionsPaul Blakey2018-10-172-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When replacing a tc flower rule, flower first requests to add the new rule (new action), then deletes the old one. But currently when asked to add a new tc flower flow, we append the actions (and counters to it). This can result in a fte with two flow counters or conflicting actions (drop and encap action) which firmware complains/errs about and isn't achieving what the user aimed for. Instead, insert the flow using the new no-append flag which will add a new HW rule, the old flow and rule will be deleted later by flower Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5: Add a no-append flow insertion modePaul Blakey2018-10-173-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If no-append flag is set, we will add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. While here, move the has_flow_tag boolean indicator to be a flag too. This patch doesn't change any functionality. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5: E-Switch, Add chains and prioritiesPaul Blakey2018-10-173-86/+339
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A chain is a group of priorities, so use the fdb parallel sub namespaces to implement chains, and a flow table for each priority in them. Because these namespaces are parallel and in series to the slow path fdb, the chains aren't connected to one another (but to the slow path), and one must use a explicit goto action to reach a different chain. Flow tables for the priorities will be created on demand and destroyed once not used. The Firmware has four pools of tables for sizes S/XS/M/L (4k, 64k, 1m, 4m). We maintain ghost copies of the pools occupancy. When a new table is to be created, we scan the pools from large to small and find the 1st table size which can be now created. When a table is destroyed, we update the relevant pool. Multi chain/prio isn't enabled yet by this patch, for now all flows will use the default chain 0, and prio 1. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5: E-Switch, Have explicit API to delete fwd rulesOr Gerlitz2018-10-173-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Be symmetric with the e-switch API to add rules which has a specific function to add fwd rules which are used as part of vport mirroring. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5: Split FDB fast path prio to multiple namespacesPaul Blakey2018-10-174-11/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Towards supporting multi-chains and priorities, split the FDB fast path to multiple namespaces (sub namespaces), each with multiple priorities. This patch adds a new flow steering type, FS_TYPE_PRIO_CHAINS, which is like current FS_TYPE_PRIO, but may contain only namespaces, and those will be in parallel to one another in terms of managing of the flow tables connections inside them. Meaning, while searching for the next or previous flow table to connect for a new table inside such namespace we skip the parallel namespaces in the same level under the FS_TYPE_PRIO_CHAINS prio we originated from. We use this new type for splitting the fast path prio into multiple parallel namespaces, each containing normal prios. The prios inside them (and their tables) will be connected to one another, but not from one parallel namespace to another, instead the last prio in each namespace will be connected to the next prio in the containing FDB namespace, which is the slow path prio. Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: Split TC add rule path for nic vs e-switchRoi Dayan2018-10-171-47/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move to have clear separation on the code path to add nic vs e-switch flows. While here we break the code that deals with adding offloaded TC tool to few smaller stages, each on helper function. Besides getting us simpler and readable code, these are pre-steps for being able to have two HW flows serving one SW TC flow for some e-switch use cases. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: Change return type of tc add flow functionsRabie Loulou2018-10-171-47/+39Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the flow add utility functions to return err code instead of rule pointers. This will allow for simpler logic when one tc rule is duplicated to two HW rules in downstream patches. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Signed-off-by: Shahar Klein <shahark@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5: Use flow counter IDs and not the wrapping cache objectMark Bloch2018-10-177-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when a flow rule is created using the FS core layer, the caller has to pass the entire flow counter object and not just the counter HW handle (ID). This requires both the FS core and the caller to have knowledge about the inner implementation of the FS layer flow counters cache and limits the possible users. Move to use the counter ID across the place when dealing with flows. Doing this decoupling, now can we privatize the inner implementation of the flow counters. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5: E-Switch, Get counters for offloaded flows from callersMark Bloch2018-10-174-35/+33Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no real reason for the e-switch logic to manage the creation of counters for offloaded flows. The API already has the directive for the caller to denote they want to attach a counter to the created flow. As such, we go and move the management of flow counters to the mlx5e tc offload logic. This also lets us remove an inelegant interface where the FS layer had to provide a way to retrieve a counter from a flow rule. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | Merge branch 'mlx5-next' of ↵Saeed Mahameed2018-10-1712-132/+233
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into net-next mlx5 updates for both net-next and rdma-next * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (21 commits) net/mlx5: Expose DC scatter to CQE capability bit net/mlx5: Update mlx5_ifc with DEVX UID bits net/mlx5: Set uid as part of DCT commands net/mlx5: Set uid as part of SRQ commands net/mlx5: Set uid as part of SQ commands net/mlx5: Set uid as part of RQ commands net/mlx5: Set uid as part of QP commands net/mlx5: Set uid as part of CQ commands net/mlx5: Rename incorrect naming in IFC file net/mlx5: Export packet reformat alloc/dealloc functions net/mlx5: Pass a namespace for packet reformat ID allocation net/mlx5: Expose new packet reformat capabilities {net, RDMA}/mlx5: Rename encap to reformat packet net/mlx5: Move header encap type to IFC header file net/mlx5: Break encap/decap into two separated flow table creation flags net/mlx5: Add support for more namespaces when allocating modify header net/mlx5: Export modify header alloc/dealloc functions net/mlx5: Add proper NIC TX steering flow tables support net/mlx5: Cleanup flow namespace getter switch logic net/mlx5: Add memic command opcode to command checker ... Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | | net/mlx5: Set uid as part of DCT commandsYishai Hadas2018-09-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set uid as part of DCT commands so that the firmware can manage the DCT object in a secured way. That will enable using a DCT that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Set uid as part of SRQ commandsYishai Hadas2018-09-251-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set uid as part of SRQ commands so that the firmware can manage the SRQ object in a secured way. That will enable using an SRQ that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Set uid as part of SQ commandsYishai Hadas2018-09-251-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set uid as part of SQ commands so that the firmware can manage the SQ object in a secured way. That will enable using an SQ that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Set uid as part of RQ commandsYishai Hadas2018-09-251-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set uid as part of RQ commands so that the firmware can manage the RQ object in a secured way. That will enable using an RQ that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Set uid as part of QP commandsYishai Hadas2018-09-251-18/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set uid as part of QP commands so that the firmware can manage the QP object in a secured way. That will enable using a QP that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Set uid as part of CQ commandsYishai Hadas2018-09-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set uid as part of CQ commands so that the firmware can manage the CQ object in a secured way. The firmware should mark this CQ with the given uid so that it can be used later on only by objects with the same uid. Upon DEVX flows that use this CQ (e.g. create QP command), the pointed CQ must have the same uid as of the issuer uid command. When a command is issued with uid=0 it means that the issuer of the command is trusted (i.e. kernel), in that case any pointed object can be used regardless of its uid. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Rename incorrect naming in IFC fileMark Bloch2018-09-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove a trailing underscore from the multicast/unicast names. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Export packet reformat alloc/dealloc functionsMark Bloch2018-09-052-8/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow for the RDMA side to allocate packet reformat context. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Pass a namespace for packet reformat ID allocationMark Bloch2018-09-053-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we attach packet reformat actions only to the FDB namespace. In preparation to be able to use that for NIC steering, pass the actual namespace as a parameter. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | {net, RDMA}/mlx5: Rename encap to reformat packetMark Bloch2018-09-058-64/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Renames all encap mlx5_{core,ib} code to use the new naming of packet reformat. This change doesn't introduce any function change and is needed to properly reflect the operation being done by this action. For example not only can we encapsulate a packet, but also decapsulate it. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Move header encap type to IFC header fileMark Bloch2018-09-051-5/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Those bits are hardware specification and should be defined in the IFC header file. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Break encap/decap into two separated flow table creation flagsMark Bloch2018-09-052-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Today we are able to attach encap and decap actions only to the FDB. In preparation to enable those actions on the NIC flow tables, break the single flag into two. Those flags control whatever a decap or encap operations can be attached to the flow table created. For FDB, if encapsulation is required, we set both of them. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| | * | | net/mlx5: Add support for more namespaces when allocating modify headerMark Bloch2018-09-051-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are RX and TX flow steering namespaces with different number of actions. Initialize them accordingly. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>