path: root/drivers/net/ethernet/mellanox/mlx5/core
Commit message (Author, Age, Files, Lines)
...
| * | net/mlx5: Add support for more namespaces when allocating modify header (Mark Bloch, 2018-09-05, 1 file, -0/+5)
    There are RX and TX flow steering namespaces with different number of actions. Initialize them accordingly.

    Signed-off-by: Mark Bloch <markb@mellanox.com>
    Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| * | net/mlx5: Export modify header alloc/dealloc functions (Mark Bloch, 2018-09-05, 2 files, -5/+2)
    Those functions will be used by the RDMA side to create modify header actions to be attached to flow steering rules via verbs.

    Signed-off-by: Mark Bloch <markb@mellanox.com>
    Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| * | net/mlx5: Add proper NIC TX steering flow tables support (Mark Bloch, 2018-09-05, 2 files, -13/+40)
    Extend the ability to add steering rules to NIC TX flow tables. For now, we are only adding TX bypass (egress) which is used by the RDMA side. This will allow to shape outgoing traffic and tweak it if needed, for example performing encapsulation or rewriting headers.

    Signed-off-by: Mark Bloch <markb@mellanox.com>
    Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| * | net/mlx5: Cleanup flow namespace getter switch logic (Mark Bloch, 2018-09-05, 1 file, -18/+6)
    Refactor the switch logic so it's simpler to follow and understand.

    Signed-off-by: Mark Bloch <markb@mellanox.com>
    Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| * | net/mlx5: Add memic command opcode to command checker (Ariel Levkovich, 2018-09-04, 1 file, -0/+4)
    Adding the alloc/dealloc memic FW command opcodes to avoid "unknown command" prints in the command string converter and internal error status handler.

    Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
    Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
* | | net/mlx5e: Do not ignore netdevice TX/RX queues number (Feras Daoud, 2018-10-11, 7 files, -28/+37)
    The current design of mlx5e driver ignores the netdevice TX/RX queues number for netdevices that RDMA IPoIB ULP creates. Instead, the queue number is initialized to the maximum number that mlx5 thinks best for performance. As a result, ULP drivers that choose to create a netdevice with queue number that is less than the maximum channels mlx5 creates, will get a memory corruption.

    This fix changes the mlx5e netdev logic to respect ULP netdevices TX/RX queue number and use it when creating resources instead of the maximum channel number.

    Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
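The idea of sizing resources from the netdevice rather than from the driver's own maximum can be sketched as below. This is only an illustration under stated assumptions: `sketch_num_channels` and its parameters are hypothetical names, and taking the smaller of the TX and RX queue counts is an assumed policy, not the logic of the actual patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the mlx5e driver): choose how many
 * channels to create for a netdev that someone else allocated.
 * Assumption: the channel count must not exceed either queue count
 * of the netdev, nor the driver's preferred maximum.
 */
static uint16_t sketch_num_channels(uint16_t netdev_txqs,
                                    uint16_t netdev_rxqs,
                                    uint16_t driver_max)
{
    /* Respect the smaller of the two queue counts the netdev has. */
    uint16_t n = netdev_txqs < netdev_rxqs ? netdev_txqs : netdev_rxqs;

    /* Never go above what the driver considers its maximum. */
    return n < driver_max ? n : driver_max;
}
```

With a netdev created with 4 TX and 4 RX queues and a driver maximum of 16, the sketch yields 4 channels, so no per-channel array is indexed past the queues that actually exist.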
* | | net/mlx5e: Use non-delayed work for update stats (Saeed Mahameed, 2018-10-11, 3 files, -11/+19)
    Convert mlx5e update stats work to a normal work structure, since it is never used delayed. Add a helper function to queue update stats work on demand which checks for some conditions and reduce code duplication to have a better abstraction.

    Fixes: ed56c5193ad8 ("net/mlx5e: Update NIC HW stats on demand only")
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Initialize all netdev common structures in one place (Saeed Mahameed, 2018-10-11, 4 files, -51/+36)
    Move all mlx5e generic structures initializations to mlx5e_netdev_init. The common structure new initializer function will be used to initialize mlx5 context for netlink created netdevs such as IPoIB mlx5 accelerated child netdevs.

    Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
* | | net/mlx5e: Always initialize update stats delayed work (Feras Daoud, 2018-10-11, 3 files, -5/+3)
    mlx5e_detach_netdev cancels update_stats work which was not initialized in ipoib netdevice profile, as a result, the following assert occurs:

    ODEBUG: assert_init not available (active state 0) object type: timer_list hint:(null)

    This change moves the update stats work to be initialized for all mlx5e netdevices.

    Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Gather common netdev init/cleanup functionality in one place (Feras Daoud, 2018-10-11, 6 files, -49/+87)
    Introduce a helper init/cleanup function that initializes mlx5e generic netdev private structure, and use them from all profiles init/cleanup callbacks.

    This patch will also be helpful to initialize/cleanup netdevs that are not created by mlx5 driver, e.g. accelerated ipoib child netdevs.

    Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | RDMA/netdev: Hoist alloc_netdev_mqs out of the driver (Denis Drozdov, 2018-10-11, 1 file, -41/+46)
    netdev has several interfaces that expect to call alloc_netdev_mqs from the core code, with the driver only providing the arguments. This is incompatible with the rdma_netdev interface that returns the netdev directly.

    Thus re-organize the API used by ipoib so that the verbs core code calls alloc_netdev_mqs for the driver. This is done by allowing the drivers to provide the allocation parameters via a 'get_params' callback and then initializing an allocated netdev as a second step.

    Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    Signed-off-by: Denis Drozdov <denisd@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | Merge tag 'mlx5-updates-2018-10-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux (David S. Miller, 2018-10-04, 10 files, -85/+265)
|\ \ \
    Saeed Mahameed says:

    ====================
    mlx5-updates-2018-10-03

    mlx5 core driver and ethernet netdev updates, please note there is a small devlink related update to allow extack argument to eswitch operations.

    From Eli Britstein,
    1) devlink: Add extack argument to the eswitch related operations
    2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
    3) net/mlx5e: Add extack messages for TC offload failures

    From Eran Ben Elisha,
    4) mlx5e: Add counter for aRFS rule insertion failures

    From Feras Daoud
    5) Fast teardown support for mlx5 device
    This change introduces the enhanced version of the "Force teardown" that allows SW to perform teardown in a faster way without the need to reclaim all the FW pages.
    Fast teardown provides the following advantages:
    1- Fix a FW race condition that could cause command timeout
    2- Avoid moving to polling mode
    3- Close the vport to prevent PCI ACK to be sent without being scattered to memory
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx5: Add Fast teardown support (Feras Daoud, 2018-10-04, 4 files, -21/+95)
    Today mlx5 devices support two teardown modes:
    1- Regular teardown
    2- Force teardown

    This change introduces the enhanced version of the "Force teardown" that allows SW to perform teardown in a faster way without the need to reclaim all the pages.

    Fast teardown provides the following advantages:
    1- Fix a FW race condition that could cause command timeout
    2- Avoid moving to polling mode
    3- Close the vport to prevent PCI ACK to be sent without being scattered to memory

    Signed-off-by: Feras Daoud <ferasda@mellanox.com>
    Reviewed-by: Majd Dibbiny <majd@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: Add new counter for aRFS rule insertion failures (Eran Ben Elisha, 2018-10-04, 3 files, -2/+10)
    Count aRFS rules insertion failure for ethtool output. In addition, move the error print into debug prints mechanism, as it could flood the dmesg and reduce system BW dramatically.

    Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
    Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: Add extack messages for TC offload failures (Eli Britstein, 2018-10-04, 1 file, -38/+118)
    Return tc extack messages for failures to user space. Messages provide reasons for not being able to offload rules to HW.

    Signed-off-by: Eli Britstein <elibr@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reviewed-by: Roi Dayan <roid@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | net/mlx5e: E-Switch, Add extack messages to devlink callbacks (Eli Britstein, 2018-10-04, 1 file, -14/+24)
    Return extack messages for failures in the e-switch devlink callbacks. Messages provide reasons for not being able to issue the operation.

    Signed-off-by: Eli Britstein <elibr@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reviewed-by: Roi Dayan <roid@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | devlink: Add extack for eswitch operations (Eli Britstein, 2018-10-04, 2 files, -10/+18)
    Add extack argument to the eswitch related operations.

    Signed-off-by: Eli Britstein <elibr@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reviewed-by: Roi Dayan <roid@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (David S. Miller, 2018-10-04, 6 files, -5/+74)
|\ \ \ \
| |/ / /
|/| | /
| | |/
| |/|
    Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net' overlapped the renaming of a netlink attribute in net-next.

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5e: Set vlan masks for all offloaded TC rules (Jianbo Liu, 2018-10-01, 1 file, -0/+3)
    In flow steering, if asked to, the hardware matches on the first ethertype which is not vlan. It's possible to set a rule as follows, which is meant to match on untagged packet, but will match on a vlan packet:

    tc filter add dev eth0 parent ffff: protocol ip flower ...

    To avoid this for packets with single tag, we set vlan masks to tell hardware to check the tags for every matched packet.

    Fixes: 095b6cfd69ce ('net/mlx5e: Add TC vlan match parsing')
    Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: E-Switch, Fix out of bound access when setting vport rate (Eran Ben Elisha, 2018-10-01, 1 file, -2/+2)
    The code that deals with eswitch vport bw guarantee was going beyond the eswitch vport array limit, fix that.

    This was pointed out by the kernel address sanitizer (KASAN).

    The error from KASAN log:
    [2018-09-15 15:04:45] BUG: KASAN: slab-out-of-bounds in mlx5_eswitch_set_vport_rate+0x8c1/0xae0 [mlx5_core]

    Fixes: c9497c98901c ("net/mlx5: Add support for setting VF min rate")
    Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules (Alaa Hleihel, 2018-10-01, 5 files, -3/+69)
    If the peer device was already unbound, then do not attempt to modify its resources, otherwise we will crash on dereferencing non-existing device.

    Fixes: 5c65c564c962 ("net/mlx5e: Support offloading TC NIC hairpin flows")
    Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5: Cache the system image guid (Alaa Hleihel, 2018-10-01, 2 files, -2/+11)
    The system image guid is a read-only field which is used by the TC offloads code to determine if two mlx5 devices belong to the same ASIC while adding flows.

    Read this once and save it on the core device rather than querying each time an offloaded flow is added.

    Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Allow reporting of checksum unnecessary (Or Gerlitz, 2018-10-01, 4 files, -0/+37)
    Currently we practically never report checksum unnecessary, because for all IP packets we take the checksum complete path.

    Enable non-default runs with reporting checksum unnecessary, using an ethtool private flag. This can be useful for performance evals and other explorations.

    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Enable reporting checksum unnecessary also for L3 packets (Or Gerlitz, 2018-10-01, 1 file, -1/+2)
    We can report checksum unnecessary also when the L3 checksum flag on the cqe is set and there's no L4 header.

    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Add ethtool control of ring params to VF representors (Gavi Teitz, 2018-10-01, 1 file, -0/+18)
    Added ethtool control to the representors for setting and querying the ring params.

    Signed-off-by: Gavi Teitz <gavi@mellanox.com>
* | | net/mlx5e: Enable multi-queue and RSS for VF representors (Gavi Teitz, 2018-10-01, 1 file, -11/+129)
    Increased the amount of channels the representors can open to be the amount of CPUs. The default amount opened remains one.

    Used the standard NIC netdev functions to:
    * Set RSS params when building the representors' params.
    * Setup an indirect TIR and RQT for the representors upon initialization.
    * Create a TTC flow table for the representors' indirect TIR (when creating the TTC table, mlx5e_set_ttc_basic_params() is not called, in order to avoid setting the inner_ttc param, which is not needed).

    Added ethtool control to the representors for setting and querying the amount of open channels. Additionally, included logic in the representors' ethtool set channels handler which controls a representor's vport rx rule, so that if there is one open channel the rx rule steers traffic to the representor's direct TIR, whereas if there is more than one channel, the rx rule steers traffic to the new TTC flow table.

    Signed-off-by: Gavi Teitz <gavi@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
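The rule described for the set-channels handler (one open channel steers to the direct TIR, more than one steers to the TTC flow table) can be sketched as a small selection function. The enum and function names here are hypothetical and chosen only for illustration; the real handler works on driver structures that are not reproduced here.

```c
/* Hypothetical destination kinds for a representor's vport rx rule. */
enum sketch_rx_dest {
    SKETCH_DEST_DIRECT_TIR, /* single channel: no RSS spreading needed */
    SKETCH_DEST_TTC_TABLE   /* several channels: classify, then RSS */
};

/*
 * Pick the rx rule destination from the number of open channels,
 * following the rule stated in the commit message above.
 */
static enum sketch_rx_dest sketch_rep_rx_dest(unsigned int open_channels)
{
    return open_channels > 1 ? SKETCH_DEST_TTC_TABLE
                             : SKETCH_DEST_DIRECT_TIR;
}
```

Keeping the single-channel case on the direct TIR preserves the previous default behaviour, since the default number of open channels remains one.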
* | | net/mlx5e: Expose ethtool rss key size / indirection table functions (Or Gerlitz, 2018-10-01, 2 files, -2/+16)
    Towards enabling RSS for the vport representors, expose the functions for querying the rss hash key size and indirection table size via ethtool.

    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Expose function for building RSS params (Gavi Teitz, 2018-10-01, 2 files, -4/+10)
    Towards enabling RSS for the vport representors, extract the procedure for building a device's RSS params, and expose the function.

    Signed-off-by: Gavi Teitz <gavi@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Provide explicit directive if to create inner indirect tirs (Or Gerlitz, 2018-10-01, 3 files, -12/+12)
    Change the driver functions that deal with creating indirect tirs to get a flag telling if inner ttc is desired.

    A pre-step for enabling rss on the vport representors, where inner ttc is not needed.

    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5: E-Switch, Provide flow dest when creating vport rx rule (Gavi Teitz, 2018-10-01, 3 files, -7/+9)
    Currently the destination for the representor e-switch rx rule is a TIR number. Towards changing that to potentially be a flow table, as part of enabling RSS for representors, modify the signature of the related e-switch API to get a flow destination.

    Signed-off-by: Gavi Teitz <gavi@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Extract creation of rep's default flow rule (Gavi Teitz, 2018-10-01, 1 file, -9/+16)
    Cleaning up the flow of the representors' rx initialization, towards enabling RSS for the representors.

    Signed-off-by: Gavi Teitz <gavi@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Enable stateless offloads for VF representor netdevs (Gavi Teitz, 2018-10-01, 1 file, -0/+10)
    Enabled checksum and TSO offloads for the representors, in order to increase their performance, which is required to increase the performance of flows that cannot be offloaded.

    Checksum offloads contribute to a general acceleration of all traffic (to around 150%), whereas the TSO offload contributes to a prominent acceleration of the representor's TX for traffic flows with larger than MTU sized packets (to around 200%). This is the usual case for TCP streams, as the PF, which serves as the uplink representor, and the VF representors employ GRO before forwarding the packets to the representor.

    GRO was enabled implicitly for the representors beforehand, and is explicitly enabled here to ensure that the representors preserve the performance boost it provides (of around 200%) when working in tandem with the TSO offload by the forwardee, which is the standard case as both the PF and the VF representors employ HW TSO.

    The impact of these changes can be seen in the following measurements taken on a setup of a VM over a VF, connected to OVS via the VF representor, to an external host:

    Before current changes:
        TCP Throughput [Gb/s]
        External host to VM   ~ 10.5
        VM to external host   ~ 23.5

    With just checksum offloads enabled:
        TCP Throughput [Gb/s]
        External host to VM   ~ 14.9
        VM to external host   ~ 28.5

    With the TSO offload also enabled:
        TCP Throughput [Gb/s]
        External host to VM   ~ 30.5

    Signed-off-by: Gavi Teitz <gavi@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Change VF representors' RQ type (Gavi Teitz, 2018-10-01, 3 files, -18/+25)
    The representors' RQ size was not large enough for them to achieve high enough performance, and therefore needed to be enlarged, while suffering a minimum hit to its memory usage. To achieve this the representors RQ size was increased, and its type was changed to be a striding RQ if it is supported.

    Towards that goal the following changes were made:
    * Extracted the sequence for setting the standard netdev's RQ params into a function
    * Replaced the sequence for setting the representor's RQ params with the standard sequence

    The impact of this change can be seen in the following measurements taken on a setup of a VM over a VF, connected to OVS via the VF representor, to an external host:

    Before current change:
        TCP Throughput [Gb/s]
        VM to external host   ~ 7.2

    With the current change (measured with a striding RQ):
        TCP Throughput [Gb/s]
        VM to external host   ~ 23.5

    Each representor now consumes 2 [MB] of memory for its packet buffers.

    Signed-off-by: Gavi Teitz <gavi@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | net/mlx5e: Ethtool steering, Support masks for l3/l4 filters (Or Gerlitz, 2018-10-01, 1 file, -40/+16)
    Allow using partial masks for L3 addresses and L4 ports across the place.

    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net (David S. Miller, 2018-09-25, 4 files, -22/+4)
|\| |
    Version bump conflict in batman-adv, take what's in net-next.

    iavf conflict, adjustment of netdev_ops in net-next conflicting with poll controller method removal in net.

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | mlx5: remove ndo_poll_controller (Eric Dumazet, 2018-09-24, 1 file, -19/+0)
    As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load.

    mlx5 uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5e: TLS, Read capabilities only when it is safe (Saeed Mahameed, 2018-09-18, 1 file, -1/+2)
    Read TLS caps from the core driver only when TLS is supported, i.e. mlx5_accel_is_tls_device returns true.

    Fixes: 790af90c00d2 ("net/mlx5e: TLS, build TLS netdev from capabilities")
    Change-Id: I5f21ff4d684901af487e366a7e0cf032b54ee9cf
    Reported-by: Michal Kubecek <mkubecek@suse.cz>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Reviewed-by: Boris Pismenny <borisp@mellanox.com>
    Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
| * | net/mlx5: Check for SQ and not RQ state when modifying hairpin SQ (Alaa Hleihel, 2018-09-18, 1 file, -1/+1)
    When modifying hairpin SQ, instead of checking if the next state equals to MLX5_SQC_STATE_RDY, we compare it against the MLX5_RQC_STATE_RDY enum value.

    The code worked since both of MLX5_RQC_STATE_RDY and MLX5_SQC_STATE_RDY have the same value today.

    This patch fixes this issue.

    Fixes: 18e568c390c6 ("net/mlx5: Hairpin pair core object setup")
    Change-Id: I6758aa7b4bd137966ae28206b70648c5bc223b46
    Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
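Why the wrong comparison went unnoticed can be shown with a small sketch. The enums below are stand-ins with a `SKETCH_` prefix and illustrative values; they are not the definitions from the mlx5 headers. The only property taken from the commit message is that the two "ready" constants happen to share a value.

```c
/* Stand-in enums; values are illustrative, not copied from the driver. */
enum sketch_rqc_state { SKETCH_RQC_STATE_RST = 0, SKETCH_RQC_STATE_RDY = 1 };
enum sketch_sqc_state { SKETCH_SQC_STATE_RST = 0, SKETCH_SQC_STATE_RDY = 1 };

/* Buggy form: an SQ state is compared against an RQ constant. */
static int sketch_sq_ready_buggy(int next_state)
{
    return next_state == SKETCH_RQC_STATE_RDY;
}

/* Fixed form: an SQ state is compared against the SQ constant. */
static int sketch_sq_ready_fixed(int next_state)
{
    return next_state == SKETCH_SQC_STATE_RDY;
}
```

Both functions return the same result for every input as long as the two constants stay equal, which is why the fix changes no behaviour today but removes a latent bug should the enums ever diverge.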
| * | net/mlx5: Fix read from coherent memory (Eli Cohen, 2018-09-18, 1 file, -1/+1)
    Use accessor function READ_ONCE to read from coherent memory modified by the device and read by the driver. This becomes most important in preemptive kernels where cond_resched implementation does not have the side effect which guaranteed the updated value.

    Fixes: 269d26f47f6f ("net/mlx5: Reduce command polling interval")
    Change-Id: Ie6deeb565ffaf76777b07448c7fbcce3510bbb8a
    Signed-off-by: Eli Cohen <eli@mellanox.com>
    Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
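A userspace approximation of the pattern is sketched below. It is not kernel code: `sketch_read_once_u8` imitates the effect of READ_ONCE for a single byte by reading through a volatile-qualified pointer, and the ownership bit and polling loop are assumptions made for illustration, not details given in the commit message.

```c
#include <stdint.h>

/* Hypothetical flag: set while the device still owns the entry. */
#define SKETCH_OWNER_HW 0x1u

/*
 * Force a fresh load on every call. Without the volatile access the
 * compiler may hoist the load out of a polling loop and spin on a
 * stale value.
 */
static uint8_t sketch_read_once_u8(const uint8_t *p)
{
    return *(const volatile uint8_t *)p;
}

/* Poll a status byte until the device clears the ownership bit. */
static int sketch_wait_for_sw_ownership(const uint8_t *status_own,
                                        unsigned int max_polls)
{
    unsigned int i;

    for (i = 0; i < max_polls; i++) {
        if (!(sketch_read_once_u8(status_own) & SKETCH_OWNER_HW))
            return 1; /* device released the entry */
    }
    return 0; /* gave up after max_polls reads */
}
```

The loop is bounded so the sketch terminates even when the bit never clears; a real driver would typically pair such polling with a timeout and a reschedule point.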
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (David S. Miller, 2018-09-13, 8 files, -56/+75)
|\| |
| * | net/mlx5: Fix possible deadlock from lockdep when adding fte to fg (Roi Dayan, 2018-09-06, 1 file, -37/+37)
    This is a false positive report due to incorrect nested lock annotations as we lock multiple fgs with the same subclass. Instead of locking all fgs only lock the one being used as was done before.

    Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
    Signed-off-by: Roi Dayan <roid@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Ethtool steering, fix udp source port value (Saeed Mahameed, 2018-09-06, 1 file, -1/+1)
    Copy and paste bug was introduced in the offending patch. We need to write udp source port value into the headers value and not headers criteria "mask".

    Fixes: 142644f8a1f8 ("net/mlx5e: Ethtool steering flow parsing refactoring")
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Check for error in mlx5_attach_interface (Huy Nguyen, 2018-09-06, 1 file, -5/+10)
    Currently, mlx5_attach_interface does not check for error after calling intf->attach or intf->add. When these two calls fail, the client is not initialized and will cause issues such as kernel panic on invalid address in the teardown path (mlx5_detach_interface).

    Fixes: 737a234bb638 ("net/mlx5: Introduce attach/detach to interface API")
    Signed-off-by: Huy Nguyen <huyn@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
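The shape of the fix, recording a client as attached only when its callback succeeded, can be sketched as follows. All types and names are hypothetical simplifications; in particular, signalling failure with a NULL return from `add` is an assumption of this sketch, not a statement about the real interface.

```c
#include <stddef.h>

/* Hypothetical, simplified client interface and per-device context. */
struct sketch_intf {
    void *(*add)(void *dev); /* assumed to return NULL on failure */
};

struct sketch_dev_ctx {
    void *context; /* client state returned by add() */
    int attached;  /* set only after a successful add() */
};

/* Stub callbacks so the sketch can be exercised. */
static void *sketch_add_ok(void *dev)
{
    return dev;
}

static void *sketch_add_fail(void *dev)
{
    (void)dev;
    return NULL;
}

/* Attach a client, leaving the context untouched when add() fails. */
static int sketch_attach(struct sketch_dev_ctx *ctx,
                         const struct sketch_intf *intf, void *dev)
{
    void *client = intf->add(dev);

    if (!client)
        return -1; /* teardown must not see a half-initialized client */

    ctx->context = client;
    ctx->attached = 1;
    return 0;
}
```

Because `attached` stays zero on failure, a later detach path can skip the client instead of dereferencing an invalid context.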
| * | net/mlx5: Consider PCI domain in search for next dev (Daniel Jurgens, 2018-09-06, 1 file, -3/+4)
    The PCI BDF is not unique. PCI domain must also be considered when searching for the next physical device during lag setup. Example below:

    mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
    mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
    mlx5_core 0001:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
    mlx5_core 0001:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)

    Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
    Reviewed-by: Aviv Heller <avivh@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
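The four addresses in the example share bus 01 and device 00 but sit in two domains, so a comparison that ignores the domain would treat them as one physical device. A minimal sketch of a domain-aware comparison follows; the struct and function are hypothetical and do not use the kernel's PCI API.

```c
#include <stdint.h>

/* Hypothetical decoded PCI address: domain:bus:dev.fn */
struct sketch_pci_addr {
    uint16_t domain;
    uint8_t bus;
    uint8_t dev;
    uint8_t fn;
};

/*
 * Two functions belong to the same physical device only if domain,
 * bus and device number all match; the function number may differ.
 */
static int sketch_same_physical_dev(const struct sketch_pci_addr *a,
                                    const struct sketch_pci_addr *b)
{
    return a->domain == b->domain &&
           a->bus == b->bus &&
           a->dev == b->dev;
}
```

With the addresses from the log, 0000:01:00.0 and 0000:01:00.1 compare as the same device, while 0000:01:00.0 and 0001:01:00.0 do not.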
| * | net/mlx5: Fix not releasing read lock when adding flow rules (Roi Dayan, 2018-09-06, 1 file, -0/+2)
    If building match list fg fails and we never jumped to search_again_locked label then the function returned without unlocking the read lock.

    Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
    Signed-off-by: Roi Dayan <roid@mellanox.com>
    Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: E-Switch, Fix memory leak when creating switchdev mode FDB tables (Raed Salem, 2018-09-06, 1 file, -0/+1)
    The memory allocated for the slow path table flow group input structure was not freed upon successful return, fix that.

    Fixes: 1967ce6ea5c8 ("net/mlx5: E-Switch, Refactor fast path FDB table creation in switchdev mode")
    Signed-off-by: Raed Salem <raeds@mellanox.com>
    Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Use u16 for Work Queue buffer strides offset (Tariq Toukan, 2018-09-06, 1 file, -1/+1)
    Minimal stride size is 16. Hence, the number of strides in a fragment (of PAGE_SIZE) is <= PAGE_SIZE / 16 <= 4K.

    u16 is sufficient to represent this.

    Fixes: d7037ad73daa ("net/mlx5: Fix QP fragmented buffer allocation")
    Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
    Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Use u16 for Work Queue buffer fragment size (Tariq Toukan, 2018-09-06, 2 files, -3/+3)
    Minimal stride size is 16. Hence, the number of strides in a fragment (of PAGE_SIZE) is <= PAGE_SIZE / 16 <= 4K.

    u16 is sufficient to represent this.

    Fixes: 388ca8be0037 ("IB/mlx5: Implement fragmented completion queue (CQ)")
    Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
    Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
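The bound used in the two u16 commits can be checked with a few lines of arithmetic. The helper names are hypothetical, and the 64 KiB page size is an assumption chosen because it is the page size at which PAGE_SIZE / 16 reaches the 4K figure quoted in the message.

```c
#include <stdint.h>

/* Minimal stride size in bytes, as stated in the commit message. */
#define SKETCH_MIN_STRIDE 16u

/* Upper bound on strides that fit in one page-sized fragment. */
static uint32_t sketch_max_strides_per_frag(uint32_t page_size)
{
    return page_size / SKETCH_MIN_STRIDE;
}

/* True when the value is representable in an unsigned 16-bit field. */
static int sketch_fits_u16(uint32_t value)
{
    return value <= UINT16_MAX;
}
```

For a 4 KiB page the bound is 256 strides, and for a 64 KiB page it is 4096; both are far below 65535, so a 16-bit field holds the count with room to spare.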
| * | net/mlx5: Fix debugfs cleanup in the device init/remove flow (Jack Morgenstein, 2018-09-06, 1 file, -2/+4)
    When initializing the device (procedure init_one), the driver calls mlx5_pci_init to perform pci initialization. As part of this initialization, mlx5_pci_init creates a debugfs directory. If this creation fails, init_one aborts, returning failure to the caller (which is the probe method caller).

    The main reason for such a failure to occur is if the debugfs directory already exists. This can happen if the last time mlx5_pci_close was called, debugfs_remove (silently) failed due to the debugfs directory not being empty.

    Guarantee that such a debugfs_remove failure will not occur by instead calling debugfs_remove_recursive in procedure mlx5_pci_close.

    Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware and software flows")
    Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
    Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Fix use-after-free in self-healing flow (Jack Morgenstein, 2018-09-06, 2 files, -4/+12)
    When the mlx5 health mechanism detects a problem while the driver is in the middle of init_one or remove_one, the driver needs to prevent the health mechanism from scheduling future work; if future work is scheduled, there is a problem with use-after-free: the system WQ tries to run the work item (which has been freed) at the scheduled future time.

    Prevent this by disabling work item scheduling in the health mechanism when the driver is in the middle of init_one() or remove_one().

    Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
    Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
    Reviewed-by: Feras Daoud <ferasda@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>