summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet
Commit message (Collapse)AuthorAgeFilesLines
* net: tulip: Constify tulip_tblKees Cook2017-09-092-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It looks like all users of tulip_tbl are reads, so mark this table as read-only. $ git grep tulip_tbl # edited to avoid line-wraps... interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ... interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs&~RxPollInt, ... interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ... interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs | TimerInt, pnic.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7); tulip.h: extern struct tulip_chip_table tulip_tbl[]; tulip_core.c:struct tulip_chip_table tulip_tbl[] = { tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR5); tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7); tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer, tulip_core.c:const char *chip_name = tulip_tbl[chip_idx].chip_name; tulip_core.c:if (pci_resource_len (pdev, 0) < tulip_tbl[chip_idx].io_size) tulip_core.c:ioaddr = pci_iomap(..., tulip_tbl[chip_idx].io_size); tulip_core.c:tp->flags = tulip_tbl[chip_idx].flags; tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer, tulip_core.c:INIT_WORK(&tp->media_work, tulip_tbl[tp->chip_id].media_task); Cc: "David S. Miller" <davem@davemloft.net> Cc: Jarod Wilson <jarod@redhat.com> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: netdev@vger.kernel.org Cc: linux-parisc@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethernet: ti: netcp_core: no need in netif_napi_delIvan Khoronzhuk2017-09-091-1/+0Star
| | | | | | | | Don't remove rx_napi specifically just before free_netdev(), it's supposed to be done in it and is confusing w/o tx_napi deletion. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* davicom: Display proper debug level up to 6Mathieu Malaterre2017-09-091-1/+1
| | | | | | | | This will make it explicit some messages are of the form: dm9000_dbg(db, 5, ... Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2017-09-06405-6506/+48243
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Support ipv6 checksum offload in sunvnet driver, from Shannon Nelson. 2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric Dumazet. 3) Allow generic XDP to work on virtual devices, from John Fastabend. 4) Add bpf device maps and XDP_REDIRECT, which can be used to build arbitrary switching frameworks using XDP. From John Fastabend. 5) Remove UFO offloads from the tree, gave us little other than bugs. 6) Remove the IPSEC flow cache, from Florian Westphal. 7) Support ipv6 route offload in mlxsw driver. 8) Support VF representors in bnxt_en, from Sathya Perla. 9) Add support for forward error correction modes to ethtool, from Vidya Sagar Ravipati. 10) Add time filter for packet scheduler action dumping, from Jamal Hadi Salim. 11) Extend the zerocopy sendmsg() used by virtio and tap to regular sockets via MSG_ZEROCOPY. From Willem de Bruijn. 12) Significantly rework value tracking in the BPF verifier, from Edward Cree. 13) Add new jump instructions to eBPF, from Daniel Borkmann. 14) Rework rtnetlink plumbing so that operations can be run without taking the RTNL semaphore. From Florian Westphal. 15) Support XDP in tap driver, from Jason Wang. 16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal. 17) Add Huawei hinic ethernet driver. 18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan Delalande. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits) i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq i40e: avoid NVM acquire deadlock during NVM update drivers: net: xgene: Remove return statement from void function drivers: net: xgene: Configure tx/rx delay for ACPI drivers: net: xgene: Read tx/rx delay for ACPI rocker: fix kcalloc parameter order rds: Fix non-atomic operation on shared flag variable net: sched: don't use GFP_KERNEL under spin lock vhost_net: correctly check tx avail during rx busy polling net: mdio-mux: add mdio_mux parameter to mdio_mux_init() rxrpc: Make service connection lookup always check for retry net: stmmac: Delete dead code for MDIO registration gianfar: Fix Tx flow control deactivation cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6 cxgb4: Fix pause frame count in t4_get_port_stats cxgb4: fix memory leak tun: rename generic_xdp to skb_xdp tun: reserve extra headroom only when XDP is set net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping net: dsa: bcm_sf2: Advertise number of egress queues ...
| * Merge branch '40GbE' of ↵David S. Miller2017-09-062-40/+61
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2017-09-05 This series contains fixes for i40e only. These two patches fix an issue where our nvmupdate tool does not work on RHEL 7.4 and newer kernels, in fact, the use of the nvmupdate tool on newer kernels can cause the cards to be non-functional unless these patches are applied. Anjali reworks the locking around accessing the NVM so that NVM acquire timeouts do not occur which was causing the failed firmware updates. Jake correctly updates the wb_desc when reading the NVM through the AdminQ. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aqJacob Keller2017-09-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When introducing the functions to read the NVM through the AdminQ, we did not correctly mark the wb_desc. Fixes: 7073f46e443e ("i40e: Add AQ commands for NVM Update for X722", 2015-06-05) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * i40e: avoid NVM acquire deadlock during NVM updateAnjali Singhai Jain2017-09-062-40/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | X722 devices use the AdminQ to access the NVM, and this requires taking the AdminQ lock. Because of this, we lock the AdminQ during i40e_read_nvm(), which is also called in places where the lock is already held, such as the firmware update path which wants to lock once and then unlock when finished after performing several tasks. Although this should have only affected X722 devices, commit 96a39aed25e6 ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) added locking for all NVM reads, regardless of device family. This resulted in us accidentally causing NVM acquire timeouts on all devices, causing failed firmware updates which left the eeprom in a corrupt state. Create unsafe non-locked variants of i40e_read_nvm_word and i40e_read_nvm_buffer, __i40e_read_nvm_word and __i40e_read_nvm_buffer respectively. These variants will not take the NVM lock and are expected to only be called in places where the NVM lock is already held if needed. Since the only caller of i40e_read_nvm_buffer() was in such a path, remove it entirely in favor of the unsafe version. If necessary we can always add it back in the future. Additionally, we now need to hold the NVM lock in i40e_validate_checksum because the call to i40e_calc_nvm_checksum now assumes that the NVM lock is held. We can further move the call to read I40E_SR_SW_CHECKSUM_WORD up a bit so that we do not need to acquire the NVM lock twice. This should resolve firmware updates and also fix potential raise that could have caused the driver to report an invalid NVM checksum upon driver load. Reported-by: Stefan Assmann <sassmann@kpanic.de> Fixes: 96a39aed25e6 ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | drivers: net: xgene: Remove return statement from void functionIyappan Subramanian2017-09-051-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 183db4 ("drivers: net: xgene: Correct probe sequence handling") changed the return type of xgene_enet_check_phy_handle() to void. This patch, removes the return statement from the last line. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | drivers: net: xgene: Configure tx/rx delay for ACPIQuan Nguyen2017-09-051-5/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes configuring tx/rx delay values for ACPI. Signed-off-by: Quan Nguyen <qnguyen@apm.com> Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | drivers: net: xgene: Read tx/rx delay for ACPIIyappan Subramanian2017-09-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes reading tx/rx delay values for ACPI. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Quan Nguyen <qnguyen@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | rocker: fix kcalloc parameter orderZahari Doychev2017-09-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function calls to kcalloc use wrong parameter order and incorrect flags values. GFP_KERNEL is used instead of flags now and the order is corrected. The change was done using the following coccinelle script: @@ expression E1,E2; type T; @@ -kcalloc(E1, E2, sizeof(T)) +kcalloc(E2, sizeof(T), GFP_KERNEL) Signed-off-by: Zahari Doychev <zahari.doychev@linux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: stmmac: Delete dead code for MDIO registrationRomain Perier2017-09-051-16/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This code is no longer used, the logging function was changed by commit fbca164776e4 ("net: stmmac: Use the right logging function in stmmac_mdio_register"). It was previously showing information about the type of the IRQ, if it's polled, ignored or a normal interrupt. As we don't want information loss, I have moved this code to phy_attached_print(). Fixes: fbca164776e4 ("net: stmmac: Use the right logging function in stmmac_mdio_register") Signed-off-by: Romain Perier <romain.perier@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | gianfar: Fix Tx flow control deactivationClaudiu Manoil2017-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The wrong register is checked for the Tx flow control bit, it should have been maccfg1 not maccfg2. This went unnoticed for so long probably because the impact is hardly visible, not to mention the tangled code from adjust_link(). First, link flow control (i.e. handling of Rx/Tx link level pause frames) is disabled by default (needs to be enabled via 'ethtool -A'). Secondly, maccfg2 always returns 0 for tx_flow_oldval (except for a few old boards), which results in Tx flow control remaining always on once activated. Fixes: 45b679c9a3ccd9e34f28e6ec677b812a860eb8eb ("gianfar: Implement PAUSE frame generation support") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6Ganesh Goudar2017-09-051-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | MPS_TX_INT_CAUSE[Bubble] is a normal condition for T6, hence ignore this interrupt for T6. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | cxgb4: Fix pause frame count in t4_get_port_statsGanesh Goudar2017-09-051-8/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MPS_STAT_CTL[CountPauseStatTx] and MPS_STAT_CTL[CountPauseStatRx] only control whether or not Pause Frames will be counted as part of the 64-Byte Tx/Rx Frame counters. These bits do not control whether Pause Frames are counted in the Total Tx/Rx Frames/Bytes counters. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | cxgb4: fix memory leakGanesh Goudar2017-09-051-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | do not reuse the loop counter which is used iterate over the ports, so that sched_tbl will be freed for all the ports. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_core: Use ARRAY_SIZE macroThomas Meyer2017-09-052-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use ARRAY_SIZE macro, rather than explicitly coding some variant of it yourself. Found with: find -type f -name "*.c" -o -name "*.h" | xargs perl -p -i -e 's/\bsizeof\s*\(\s*(\w+)\s*\)\s*\ /\s*sizeof\s*\(\s*\1\s*\[\s*0\s*\]\s*\) /ARRAY_SIZE(\1)/g' and manual check/verification. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: qualcomm: rmnet: Rename real_dev_info to portSubash Abhinov Kasiviswanathan2017-09-048-87/+82Star
| | | | | | | | | | | | | | | | | | | | | | | | Make it similar to drivers like ipvlan / macvlan so it is easier to read. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: qualcomm: rmnet: Implement ndo_get_iflinkSubash Abhinov Kasiviswanathan2017-09-044-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | This makes it easier to find out the parent dev. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: qualcomm: rmnet: Refactor the new rmnet dev creationSubash Abhinov Kasiviswanathan2017-09-043-84/+26Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Data format can be directly set from rmnet_newlink() since the rmnet real dev info is already available. Since __rmnet_get_real_dev_info() is no longer used in rmnet_config.c after removal of those functions, move content to rmnet_get_real_dev_info(). __rmnet_set_endpoint_config() is collapsed into rmnet_set_endpoint_config() since only mux_id was being set additionally within it. Remove an unnecessary mux_id check. Set the mux_id for the rmnet_dev within rmnet_vnd_newlink() itself. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: qualcomm: rmnet: Move the device creation logSubash Abhinov Kasiviswanathan2017-09-041-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current log is not very useful as it does not log the device name since it it is prior to registration - (unnamed net_device) (uninitialized): Setting up device Modify to log after the device registration - rmnet1: rmnet dev created Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: qualcomm: rmnet: Remove the unused endpoint -1Subash Abhinov Kasiviswanathan2017-09-041-11/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was used only in the original patch series where the IOCTLs were present and is no longer in use. Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: qualcomm: rmnet: Fix memory corruption if mux_id is greater than 32Subash Abhinov Kasiviswanathan2017-09-043-4/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rmnet_rtnl_validate() was checking for upto mux_id 254, however the rmnet_devices devices could hold upto 32 entries only. Fix this by increasing the size of the rmnet_devices. Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | nfp: flower: restore RTNL locking around representor updatesJakub Kicinski2017-09-041-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we moved to updating representors from a workqueue grabbing the RTNL somehow got lost in the process. Restore it, and make sure RCU lock is not held while we are grabbing the RTNL. RCU protects the representor table, so since we will be under RTNL we can drop RCU lock as soon as we find the netdev pointer. RTNL is needed for the dev_set_mtu() call. Fixes: 2dff19622421 ("nfp: process MTU updates from firmware flower app") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | nfp: build the flower offload by defaultJakub Kicinski2017-09-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's reasonable to assume that if user selects to build the NFP driver all offload capabilities will be enabled by default. Change the CONFIG_NFP_APP_FLOWER to default to enabled. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | nfp: be drop monitor friendlyJakub Kicinski2017-09-042-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use dev_consume_skb_any() in place of dev_kfree_skb_any() when control frame has been successfully processed in flower and on the driver's main TX completion path. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | nfp: move the start/stop app callbacks backJakub Kicinski2017-09-041-15/+11Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since representors are now created with a separate callback start/stop app callbacks can be moved again to their original location. They are intended to app-specific init/clean up over the control channel. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | nfp: flower: base lifetime of representors on existence of lower vNICJakub Kicinski2017-09-041-23/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create representors after lower vNIC is registered and destroy them before it is destroyed. Move the code out of start/stop callbacks directly into vnic_init/clean callbacks. Make sure SR-IOV callbacks don't try to create representors when lower device does not exist. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | nfp: separate app vNIC init/clean from alloc/freeJakub Kicinski2017-09-049-27/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently only have one app callback for vNIC creation and destruction. This is insufficient, because some actions have to be taken before netdev is registered, after it's registered and after it's unregistered. Old callbacks were really corresponding to alloc/free actions. Rename them and add proper init/clean. Apps using representors will be able to use new callbacks to manage lifetime of upper devices. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge tag 'mlx5-updates-2017-09-03' of ↵David S. Miller2017-09-047-227/+215Star
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-09-03 This series from Tariq includes micro data path optimization for mlx5e netdevice driver. Mainly Tariq introduces the following changes to NAPI and RX handling path of the driver: - RX ring structure reorganizing - Trivial code refactoring and optimization - NAPI busy-poll for when fast UMR is in progress - Non-atomic state operations in NAPI context - Remove unnecessary fields from fast path structures - page-cache micro optimization - Rely on NAPI to avoid missing an IRQ for RX/TX shared NAPI contexts - Stop NAPI when irq changes affinity - Distribute RSS table among all RX rings ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/mlx5e: Distribute RSS table among all RX ringsTariq Toukan2017-09-033-17/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In default, uniformly distribute the RSS indirection table entries among all RX rings, rather than restricting this only to the rings on the close NUMA node. irqbalancer would anyway dynamically override the default affinities set to the RX rings. This gives better multi-stream performance and CPU util. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Stop NAPI when irq balancer changes affinityTariq Toukan2017-09-033-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NAPI context keeps rescheduling on same CPU as long as it's busy. This doesn't give the oppurtunity for changes in irq affinities to take effect. Fix that by calling napi_complete_done() upon a change in affinity. This would stop the NAPI and reschedule it on the new CPU. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Use kernel's mechanism to avoid missing NAPIsTariq Toukan2017-09-033-15/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used a channel state bit MLX5E_CHANNEL_NAPI_SCHED to make sure no NAPI is missed when a channel's napi_schedule() is called for completion events of the different channel's resources/rings while NAPI is currently running. Now, as similar mechanism is implemented in kernel, ("39e6c8208d7b net: solve a NAPI race"), we obsolete our own implementation and rely on the return value of napi_complete_done(). This patch removes a redundant overhead of atomic bit operations. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Slightly increase RX page-cache sizeTariq Toukan2017-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In XDP_TX flow, we now get back quicker to each page in page-cache, and on some occasions refcount does not get back to 1 on time, causing some costly page allocations. Slightly increase the size of RX page-cache to significantly decrease the chances for this to happen. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Don't recycle page if moved to far NUMATariq Toukan2017-09-033-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid recycling an RX page if it moved to another NUMA node. Add an ethtool counter to count such events. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Remove unnecessary fields in ICO SQTariq Toukan2017-09-033-26/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As of current design, in each NAPI, only a single UMR WQE completion could be available in the completion queue of the the internal control operations (ICO) send queue, in addition to nop operations that require no actions upon completion. This renders the consume index obsolete, as the wqe_counter field in CQE is sufficient. This helps removing a memory barrier, and obsoletes the need for tracking the num_wqebbs to update the consumer counter. In addition, remove other unused fields in icosq struct: pdev, dma_fifo_pc, and prev_cc. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Type-specific optimizations for RX post WQEs functionTariq Toukan2017-09-034-87/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Separate the RX post WQEs function of the different RQ types. This enables RQ type-specific optimizations in data-path. Poll the ICOSQ completion queue only for Striding RQ, and only when a UMR post completion could be possibly available. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Non-atomic RQ state indicator for UMR WQE in progressTariq Toukan2017-09-033-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The indication for a UMR WQE in progress is needed only within the NAPI context, and hence no races possible and no need for the use of atomic operations. The only place the flag is read outside of NAPI context is in closure flow, after RQ is disabled flag is no more accessed in NAPI. Use a boolean instead of a bit in ring state, so that its non-atomic set operations do not race with the atomic sets of the other bits. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Non-atomic indicator for ring enabled stateTariq Toukan2017-09-035-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rings enabled state change occurs in control path only, and is always followed by a napi_sychronize(), so that following NAPIs read the new value. This read does not need to be atomic. The RQ auto-moderation bit is not set/cleared in data-path. No need for atomic read, a regular read operation is sufficient. In RQ creation time as well, there's no multiple threads trying to access it yet, hence a regular read can be used. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Refactor data-path lro header functionTariq Toukan2017-09-031-25/+20Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor function mlx5e_lro_update_hdr() to reduce number of branches. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Early-return on empty completion queuesTariq Toukan2017-09-032-24/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NAPI context handles different kinds of completion queues (RX, TX, and others). Hence, upon a poll trial, some of them might be empty. Here we early-return upon empty completion queues, as well as full rx buffer, and save unnecessary logic and memory barriers. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: NAPI busy-poll when UMR post is in progressTariq Toukan2017-09-031-5/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a UMR post is in progress, it means that there's a missing WQE in RQ, and that a completion will be shortly available in ICO SQ completion queue. Prefer busy-poll to handle it as soon as possible. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Small enhancements for RX MPWQE allocation and freeTariq Toukan2017-09-032-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dma offset of a MPWQE (Multi-Packet WQE) in memory region is fixed for all rounds. Calculate it once on creation time, instead of in runtime. This also obsoletes the wqe argument in the function. In addition, optimize dma_info iterator calculation. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Use memset to init skbs_frags array to zerosTariq Toukan2017-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In RX data-path, use memset() instead of loop assignment to init the whole skbs_frags array. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Remove unnecessary wqe_sz field from RQ bufferTariq Toukan2017-09-032-6/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Field is used only locally within the RQ create function. The use of a local variable is sufficient. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Replace multiplication by stride size with a shiftTariq Toukan2017-09-033-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In RX data-path, use shift operations instead of a regular multiplication by stride size, as it is a power of two. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Reorganize struct mlx5e_rqTariq Toukan2017-09-033-19/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bring fast-path fields together, and combine RX WQE mutual exclusive fields into a union. Page-reuse and XDP are mutually exclusive and cannot be used at the same time. Use a union to combine their footprints. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | mlxsw: spectrum_router: Support GRE tunnelsPetr Machata2017-09-044-0/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces callbacks and tunnel type to offload GRE tunnels. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | mlxsw: spectrum_router: Add loopback accessorsPetr Machata2017-09-042-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct mlxsw_sp_rif is a router-private structure, and therefore everything related to it is as well: parameters, and derived RIF types including loopbacks. IPIP module needs access to some details of loopback interfaces, but exporting all the RIF shebang would create too large an interface. So instead export just the bare minimum necessary: accessors for RIF index and underlay VRF ID. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | mlxsw: spectrum: Register for IPIP_DECAP_ERROR trapPetr Machata2017-09-042-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These traps are generated for packets that fail checks for source IP, encapsulation type, or GRE key. Trap these packets to CPU for follow-up handling by the kernel, which will send ICMP destination unreachable responses. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>