summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw
Commit message (Collapse)AuthorAgeFilesLines
* IB/mlx5: Fix use-after-free error while accessing ev_file pointerYishai Hadas2019-08-121-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Call to uverbs_close_fd() releases file pointer to 'ev_file' and mlx5_ib_dev is going to be inaccessible. Cache pointer prior cleaning resources to solve the KASAN warning below. BUG: KASAN: use-after-free in devx_async_event_close+0x391/0x480 [mlx5_ib] Read of size 8 at addr ffff888301e3cec0 by task devx_direct_tes/4631 CPU: 1 PID: 4631 Comm: devx_direct_tes Tainted: G OE 5.3.0-rc1-for-upstream-dbg-2019-07-26_01-19-56-93 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014 Call Trace: dump_stack+0x9a/0xeb print_address_description+0x1e2/0x400 ? devx_async_event_close+0x391/0x480 [mlx5_ib] __kasan_report+0x15c/0x1df ? devx_async_event_close+0x391/0x480 [mlx5_ib] kasan_report+0xe/0x20 devx_async_event_close+0x391/0x480 [mlx5_ib] __fput+0x26a/0x7b0 task_work_run+0x10d/0x180 exit_to_usermode_loop+0x137/0x160 do_syscall_64+0x3c7/0x490 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f5df907d664 Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 6a cd 20 00 48 63 ff 85 c0 75 13 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 44 f3 c3 66 90 48 83 ec 18 48 89 7c 24 08 e8 RSP: 002b:00007ffd353cb958 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 000056017a88c348 RCX: 00007f5df907d664 RDX: 00007f5df969d400 RSI: 00007f5de8f1ec90 RDI: 0000000000000006 RBP: 00007f5df9681dc0 R08: 00007f5de8736410 R09: 000056017a9d2dd0 R10: 000000000000000b R11: 0000000000000246 R12: 00007f5de899d7d0 R13: 00007f5df96c4248 R14: 00007f5de8f1ecb0 R15: 000056017ae41308 Allocated by task 4631: save_stack+0x19/0x80 kasan_kmalloc.constprop.3+0xa0/0xd0 alloc_uobj+0x71/0x230 [ib_uverbs] alloc_begin_fd_uobject+0x2e/0xc0 [ib_uverbs] rdma_alloc_begin_uobject+0x96/0x140 [ib_uverbs] ib_uverbs_run_method+0xdf0/0x1940 [ib_uverbs] ib_uverbs_cmd_verbs+0x57e/0xdb0 [ib_uverbs] ib_uverbs_ioctl+0x177/0x260 [ib_uverbs] do_vfs_ioctl+0x18f/0x1010 ksys_ioctl+0x70/0x80 __x64_sys_ioctl+0x6f/0xb0 do_syscall_64+0x95/0x490 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 4631: save_stack+0x19/0x80 __kasan_slab_free+0x11d/0x160 slab_free_freelist_hook+0x67/0x1a0 kfree+0xb9/0x2a0 uverbs_close_fd+0x118/0x1c0 [ib_uverbs] devx_async_event_close+0x28a/0x480 [mlx5_ib] __fput+0x26a/0x7b0 task_work_run+0x10d/0x180 exit_to_usermode_loop+0x137/0x160 do_syscall_64+0x3c7/0x490 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888301e3cda8 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 280 bytes inside of 512-byte region [ffff888301e3cda8, ffff888301e3cfa8) The buggy address belongs to the page: page:ffffea000c078e00 refcount:1 mapcount:0 mapping:ffff888352811300 index:0x0 compound_mapcount: 0 flags: 0x2fffff80010200(slab|head) raw: 002fffff80010200 ffffea000d152608 ffffea000c077808 ffff888352811300 raw: 0000000000000000 0000000000250025 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888301e3cd80: fc fc fc fc fc fb fb fb fb fb fb fb fb fb fb fb ffff888301e3ce00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888301e3ce80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888301e3cf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888301e3cf80: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc Disabling lock debugging due to kernel taint Cc: <stable@vger.kernel.org> # 5.2 Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Link: https://lore.kernel.org/r/20190808081538.28772-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mlx5: Check the correct variable in error handling codeDan Carpenter2019-08-071-1/+1
| | | | | | | | | | | The code accidentally checks "event_sub" instead of "event_sub->eventfd". Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190807123236.GA11452@mwanda Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mlx5: Fix implicit MR release flowYishai Hadas2019-08-071-15/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once implicit MR is being called to be released by ib_umem_notifier_release() its leaves were marked as "dying". However, when dereg_mr()->mlx5_ib_free_implicit_mr()->mr_leaf_free() is called, it skips running the mr_leaf_free_action (i.e. umem_odp->work) when those leaves were marked as "dying". As such ib_umem_release() for the leaves won't be called and their MRs will be leaked as well. When an application exits/killed without calling dereg_mr we might hit the above flow. This fatal scenario is reported by WARN_ON() upon mlx5_ib_dealloc_ucontext() as ibcontext->per_mm_list is not empty, the call trace can be seen below. Originally the "dying" mark as part of ib_umem_notifier_release() was introduced to prevent pagefault_mr() from returning a success response once this happened. However, we already have today the completion mechanism so no need for that in those flows any more. Even in case a success response will be returned the firmware will not find the pages and an error will be returned in the following call as a released mm will cause ib_umem_odp_map_dma_pages() to permanently fail mmget_not_zero(). Fix the above issue by dropping the "dying" from the above flows. The other flows that are using "dying" are still needed it for their synchronization purposes. WARNING: CPU: 1 PID: 7218 at drivers/infiniband/hw/mlx5/main.c:2004 mlx5_ib_dealloc_ucontext+0x84/0x90 [mlx5_ib] CPU: 1 PID: 7218 Comm: ibv_rc_pingpong Tainted: G E 5.2.0-rc6+ #13 Call Trace: uverbs_destroy_ufile_hw+0xb5/0x120 [ib_uverbs] ib_uverbs_close+0x1f/0x80 [ib_uverbs] __fput+0xbe/0x250 task_work_run+0x88/0xa0 do_exit+0x2cb/0xc30 ? __fput+0x14b/0x250 do_group_exit+0x39/0xb0 get_signal+0x191/0x920 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_csk_accept+0x229/0x2f0 do_signal+0x36/0x5e0 ? put_unused_fd+0x5b/0x70 ? __sys_accept4+0x1a6/0x1e0 ? inet_hash+0x35/0x40 ? release_sock+0x43/0x90 ? _raw_spin_unlock_bh+0xa/0x20 ? inet_listen+0x9f/0x120 exit_to_usermode_loop+0x5c/0xc6 do_syscall_64+0x182/0x1b0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/20190805083010.21777-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/hns: Fix error return code in hns_roce_v1_rsv_lp_qp()Wei Yongjun2019-08-011-1/+3
| | | | | | | | | | | | Fix to return error code -ENOMEM from the rdma_zalloc_drv_obj() error handling case instead of 0, as done elsewhere in this function. Fixes: e8ac9389f0d7 ("RDMA: Fix allocation failure on pointer pd") Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190801012725.150493-1-weiyongjun1@huawei.com Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/mlx5: Release locks during notifier unregisterLeon Romanovsky2019-08-011-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The below kernel panic was observed when created bond mode LACP with GRE tunnel on top. The reason to it was not released spinlock during mlx5 notify unregsiter sequence. [ 234.562007] BUG: scheduling while atomic: sh/10900/0x00000002 [ 234.563005] Preemption disabled at: [ 234.566864] ------------[ cut here ]------------ [ 234.567120] DEBUG_LOCKS_WARN_ON(val > preempt_count()) [ 234.567139] WARNING: CPU: 16 PID: 10900 at kernel/sched/core.c:3203 preempt_count_sub+0xca/0x170 [ 234.569550] CPU: 16 PID: 10900 Comm: sh Tainted: G W 5.2.0-rc1-for-linust-dbg-2019-05-25_04-57-33-60 #1 [ 234.569886] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.6.1 02/12/2018 [ 234.570183] RIP: 0010:preempt_count_sub+0xca/0x170 [ 234.570404] Code: 03 38 d0 7c 08 84 d2 0f 85 b0 00 00 00 8b 15 dd 02 03 04 85 d2 75 ba 48 c7 c6 00 e1 88 83 48 c7 c7 40 e1 88 83 e8 76 11 f7 ff <0f> 0b 5b c3 65 8b 05 d3 1f d8 7e 84 c0 75 82 e8 62 c3 c3 00 85 c0 [ 234.570911] RSP: 0018:ffff888b94477b08 EFLAGS: 00010286 [ 234.571133] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [ 234.571391] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000246 [ 234.571648] RBP: ffff888ba5560000 R08: fffffbfff08962d5 R09: fffffbfff08962d5 [ 234.571902] R10: 0000000000000001 R11: fffffbfff08962d4 R12: ffff888bac6e9548 [ 234.572157] R13: ffff888babfaf728 R14: ffff888bac6e9568 R15: ffff888babfaf750 [ 234.572412] FS: 00007fcafa59b740(0000) GS:ffff888bed200000(0000) knlGS:0000000000000000 [ 234.572686] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 234.572914] CR2: 00007f984f16b140 CR3: 0000000b2bf0a001 CR4: 00000000001606e0 [ 234.573172] Call Trace: [ 234.573336] _raw_spin_unlock+0x2e/0x50 [ 234.573542] mlx5_ib_unbind_slave_port+0x1bc/0x690 [mlx5_ib] [ 234.573793] mlx5_ib_cleanup_multiport_master+0x1d3/0x660 [mlx5_ib] [ 234.574039] mlx5_ib_stage_init_cleanup+0x4c/0x360 [mlx5_ib] [ 234.574271] ? kfree+0xf5/0x2f0 [ 234.574465] __mlx5_ib_remove+0x61/0xd0 [mlx5_ib] [ 234.574688] ? __mlx5_ib_remove+0xd0/0xd0 [mlx5_ib] [ 234.574951] mlx5_remove_device+0x234/0x300 [mlx5_core] [ 234.575224] mlx5_unregister_device+0x4d/0x1e0 [mlx5_core] [ 234.575493] remove_one+0x4f/0x160 [mlx5_core] [ 234.575704] pci_device_remove+0xef/0x2a0 [ 234.581407] ? pcibios_free_irq+0x10/0x10 [ 234.587143] ? up_read+0xc1/0x260 [ 234.592785] device_release_driver_internal+0x1ab/0x430 [ 234.598442] unbind_store+0x152/0x200 [ 234.604064] ? sysfs_kf_write+0x3b/0x180 [ 234.609441] ? sysfs_file_ops+0x160/0x160 [ 234.615021] kernfs_fop_write+0x277/0x440 [ 234.620288] ? __sb_start_write+0x1ef/0x2c0 [ 234.625512] vfs_write+0x15e/0x460 [ 234.630786] ksys_write+0x156/0x1e0 [ 234.635988] ? __ia32_sys_read+0xb0/0xb0 [ 234.641120] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 234.646163] do_syscall_64+0x95/0x470 [ 234.651106] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 234.656004] RIP: 0033:0x7fcaf9c9cfd0 [ 234.660686] Code: 73 01 c3 48 8b 0d c0 6e 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d cd cf 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee cb 01 00 48 89 04 24 [ 234.670128] RSP: 002b:00007ffd3b01ddd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 234.674811] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fcaf9c9cfd0 [ 234.679387] RDX: 000000000000000d RSI: 00007fcafa5c1000 RDI: 0000000000000001 [ 234.683848] RBP: 00007fcafa5c1000 R08: 000000000000000a R09: 00007fcafa59b740 [ 234.688167] R10: 00007ffd3b01d8e0 R11: 0000000000000246 R12: 00007fcaf9f75400 [ 234.692386] R13: 000000000000000d R14: 0000000000000001 R15: 0000000000000000 [ 234.696495] irq event stamp: 153067 [ 234.700525] hardirqs last enabled at (153067): [<ffffffff83258c39>] _raw_spin_unlock_irqrestore+0x59/0x70 [ 234.704665] hardirqs last disabled at (153066): [<ffffffff83259382>] _raw_spin_lock_irqsave+0x22/0x90 [ 234.708722] softirqs last enabled at (153058): [<ffffffff836006c5>] __do_softirq+0x6c5/0xb4e [ 234.712673] softirqs last disabled at (153051): [<ffffffff81227c1d>] irq_exit+0x17d/0x1d0 [ 234.716601] ---[ end trace 5dbf096843ee9ce6 ]--- Fixes: df097a278c75 ("IB/mlx5: Use the new mlx5 core notifier API") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731083852.584-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/hfi1: Fix Spectre v1 vulnerabilityGustavo A. R. Silva2019-08-011-0/+2
| | | | | | | | | | | | | | | | | | sl is controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. Fix this by sanitizing sl before using it to index ibp->sl_to_sc. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/ Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20190731175428.GA16736@embeddedor Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mlx5: Fix MR registration flow to use UMR properlyGuy Levi2019-08-011-18/+9Star
| | | | | | | | | | | | | | Driver shouldn't allow to use UMR to register a MR when umr_modify_atomic_disabled is set. Otherwise it will always end up with a failure in the post send flow which sets the UMR WQE to modify atomic access right. Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190731081929.32559-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/qedr: Fix the hca_type and hca_rev returned in device attributesMichal Kalderon2019-07-291-2/+8
| | | | | | | | | | | There was a place holder for hca_type and vendor was returned in hca_rev. Fix the hca_rev to return the hw revision and fix the hca_type to return an informative string representing the hca. Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Link: https://lore.kernel.org/r/20190728111338.21930-1-michal.kalderon@marvell.com Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/hns: Fix build errorYueHaibing2019-07-292-9/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | If INFINIBAND_HNS_HIP08 is selected and HNS3 is m, but INFINIBAND_HNS is y, building fails: drivers/infiniband/hw/hns/hns_roce_hw_v2.o: In function `hns_roce_hw_v2_exit': hns_roce_hw_v2.c:(.exit.text+0xd): undefined reference to `hnae3_unregister_client' drivers/infiniband/hw/hns/hns_roce_hw_v2.o: In function `hns_roce_hw_v2_init': hns_roce_hw_v2.c:(.init.text+0xd): undefined reference to `hnae3_register_client' Also if INFINIBAND_HNS_HIP06 is selected and HNS_DSAF is m, but INFINIBAND_HNS is y, building fails: drivers/infiniband/hw/hns/hns_roce_hw_v1.o: In function `hns_roce_v1_reset': hns_roce_hw_v1.c:(.text+0x39fa): undefined reference to `hns_dsaf_roce_reset' hns_roce_hw_v1.c:(.text+0x3a25): undefined reference to `hns_dsaf_roce_reset' Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: dd74282df573 ("RDMA/hns: Initialize the PCI device for hip08 RoCE") Fixes: 08805fdbeb2d ("RDMA/hns: Split hw v1 driver from hns roce driver") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20190724065443.53068-1-yuehaibing@huawei.com Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specificationYishai Hadas2019-07-251-1/+0Star
| | | | | | | | | | | | | | | | | The specification for the Toeplitz function doesn't require to set the key explicitly to be symmetric. In case a symmetric functionality is required a symmetric key can be simply used. Wrongly forcing the algorithm to symmetric causes the wrong packet distribution and a performance degradation. Link: https://lore.kernel.org/r/20190723065733.4899-7-leon@kernel.org Cc: <stable@vger.kernel.org> # 4.7 Fixes: 28d6137008b2 ("IB/mlx5: Add RSS QP support") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Alex Vainman <alexv@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mlx5: Prevent concurrent MR updates during invalidationMoni Shoua2019-07-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | The device requires that memory registration work requests that update the address translation table of a MR will be fenced if posted together. This scenario can happen when address ranges are invalidated by the mmu in separate concurrent calls to the invalidation callback. We prefer to block concurrent address updates for a single MR over fencing since making the decision if a WQE needs fencing will be more expensive and fencing all WQEs is a too radical choice. Further, it isn't clear that this code can even run safely concurrently, so a lock is a safer choice. Fixes: b4cfe447d47b ("IB/mlx5: Implement on demand paging by adding support for MMU notifiers") Link: https://lore.kernel.org/r/20190723065733.4899-8-leon@kernel.org Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mlx5: Fix clean_mr() to work in the expected orderYishai Hadas2019-07-241-3/+3
| | | | | | | | | | | | | | | | | Any dma map underlying the MR should only be freed once the MR is fenced at the hardware. As of the above we first destroy the MKEY and just after that can safely call to dma_unmap_single(). Link: https://lore.kernel.org/r/20190723065733.4899-6-leon@kernel.org Cc: <stable@vger.kernel.org> # 4.3 Fixes: 8a187ee52b04 ("IB/mlx5: Support the new memory registration API") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cacheYishai Hadas2019-07-241-1/+3
| | | | | | | | | | | | | | | | | | | | | Fix unreg_umr to move the MR to a kernel owned PD (i.e. the UMR PD) which can't be accessed by userspace. This ensures that nothing can continue to access the MR once it has been placed in the kernels cache for reuse. MRs in the cache continue to have their HW state, including DMA tables, present. Even though the MR has been invalidated, changing the PD provides an additional layer of protection against use of the MR. Link: https://lore.kernel.org/r/20190723065733.4899-5-leon@kernel.org Cc: <stable@vger.kernel.org> # 3.10 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mlx5: Use direct mkey destroy command upon UMR unreg failureYishai Hadas2019-07-241-5/+8
| | | | | | | | | | | | | | | | | | | | | | | Use a direct firmware command to destroy the mkey in case the unreg UMR operation has failed. This prevents a case that a mkey will leak out from the cache post a failure to be destroyed by a UMR WR. In case the MR cache limit didn't reach a call to add another entry to the cache instead of the destroyed one is issued. In addition, replaced a warn message to WARN_ON() as this flow is fatal and can't happen unless some bug around. Link: https://lore.kernel.org/r/20190723065733.4899-4-leon@kernel.org Cc: <stable@vger.kernel.org> # 4.10 Fixes: 49780d42dfc9 ("IB/mlx5: Expose MR cache for mlx5_ib") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mlx5: Fix unreg_umr to ignore the mkey stateYishai Hadas2019-07-243-6/+11
| | | | | | | | | | | | | | | | Fix unreg_umr to ignore the mkey state and do not fail if was freed. This prevents a case that a user space application already changed the mkey state to free and then the UMR operation will fail leaving the mkey in an inappropriate state. Link: https://lore.kernel.org/r/20190723065733.4899-3-leon@kernel.org Cc: <stable@vger.kernel.org> # 3.19 Fixes: 968e78dd9644 ("IB/mlx5: Enhance UMR support to allow partial page table update") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mlx5: Replace kfree with kvfreeChuhong Yuan2019-07-221-2/+2
| | | | | | | | | | | Memory allocated by kvzalloc should not be freed by kfree(), use kvfree() instead. Fixes: 813e90b1aeaa ("IB/mlx5: Add advise_mr() support") Link: https://lore.kernel.org/r/20190717082101.14196-1-hslester96@gmail.com Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/bnxt_re: Honor vlan_id in GID entry comparisonSelvin Xavier2019-07-225-13/+30
| | | | | | | | | | | | | | | | | | | | A GID entry consists of GID, vlan, netdev and smac. Extend GID duplicate check comparisons to consider vlan_id as well to support IPv6 VLAN based link local addresses. Introduce a new structure (bnxt_qplib_gid_info) to hold gid and vlan_id information. The issue is discussed in the following thread https://lore.kernel.org/r/AM0PR05MB4866CFEDCDF3CDA1D7D18AA5D1F20@AM0PR05MB4866.eurprd05.prod.outlook.com Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs") Cc: <stable@vger.kernel.org> # v5.2+ Link: https://lore.kernel.org/r/20190715091913.15726-1-selvin.xavier@broadcom.com Reported-by: Yi Zhang <yi.zhang@redhat.com> Co-developed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/hfi1: Drop all TID RDMA READ RESP packets after r_next_psnKaike Wan2019-07-221-41/+1Star
| | | | | | | | | | | | | | | | | | | | | | | When a TID sequence error occurs while receiving TID RDMA READ RESP packets, all packets after flow->flow_state.r_next_psn should be dropped, including those response packets for subsequent segments. The current implementation will drop the subsequent response packets for the segment to complete next, but may accept packets for subsequent segments and therefore mistakenly advance the r_next_psn fields for the corresponding software flows. This may result in failures to complete subsequent segments after the current segment is completed. The fix is to only use the flow pointed by req->clear_tail for checking KDETH PSN instead of finding a flow from the request's flow array. Fixes: b885d5be9ca1 ("IB/hfi1: Unify the software PSN check for TID RDMA READ/WRITE") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20190715164540.74174.54702.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/hfi1: Field not zero-ed when allocating TID flow memoryKaike Wan2019-07-221-0/+1
| | | | | | | | | | | | | | | | | | | | | The field flow->resync_npkts is added for TID RDMA WRITE request and zero-ed when a TID RDMA WRITE RESP packet is received by the requester. This field is used to rewind a request during retry in the function hfi1_tid_rdma_restart_req() shared by both TID RDMA WRITE and TID RDMA READ requests. Therefore, when a TID RDMA READ request is retried, this field may not be initialized at all, which causes the retry to start at an incorrect psn, leading to the drop of the retry request by the responder. This patch fixes the problem by zeroing out the field when the flow memory is allocated. Fixes: 838b6fd2d9ca ("IB/hfi1: TID RDMA RcvArray programming and TID allocation") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20190715164534.74174.6177.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/hfi1: Unreserve a flushed OPFN requestKaike Wan2019-07-221-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an OPFN request is flushed, the request is completed without unreserving itself from the send queue. Subsequently, when a new request is post sent, the following warning will be triggered: WARNING: CPU: 4 PID: 8130 at rdmavt/qp.c:1761 rvt_post_send+0x72a/0x880 [rdmavt] Call Trace: [<ffffffffbbb61e41>] dump_stack+0x19/0x1b [<ffffffffbb497688>] __warn+0xd8/0x100 [<ffffffffbb4977cd>] warn_slowpath_null+0x1d/0x20 [<ffffffffc01c941a>] rvt_post_send+0x72a/0x880 [rdmavt] [<ffffffffbb4dcabe>] ? account_entity_dequeue+0xae/0xd0 [<ffffffffbb61d645>] ? __kmalloc+0x55/0x230 [<ffffffffc04e1a4c>] ib_uverbs_post_send+0x37c/0x5d0 [ib_uverbs] [<ffffffffc04e5e36>] ? rdma_lookup_put_uobject+0x26/0x60 [ib_uverbs] [<ffffffffc04dbce6>] ib_uverbs_write+0x286/0x460 [ib_uverbs] [<ffffffffbb6f9457>] ? security_file_permission+0x27/0xa0 [<ffffffffbb641650>] vfs_write+0xc0/0x1f0 [<ffffffffbb64246f>] SyS_write+0x7f/0xf0 [<ffffffffbbb74ddb>] system_call_fastpath+0x22/0x27 This patch fixes the problem by moving rvt_qp_wqe_unreserve() into rvt_qp_complete_swqe() to simplify the code and make it less error-prone. Fixes: ca95f802ef51 ("IB/hfi1: Unreserve a reserved request when it is completed") Link: https://lore.kernel.org/r/20190715164528.74174.31364.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/hfi1: Check for error on call to alloc_rsm_map_tableJohn Fleck2019-07-221-2/+9
| | | | | | | | | | | | | The call to alloc_rsm_map_table does not check if the kmalloc fails. Check for a NULL on alloc, and bail if it fails. Fixes: 372cc85a13c9 ("IB/hfi1: Extract RSM map table init from QOS") Link: https://lore.kernel.org/r/20190715164521.74174.27047.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: John Fleck <john.fleck@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/hns: Fix sg offset non-zero issueXi Wang2019-07-221-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When run perftest in many times, the system will report a BUG as follows: BUG: Bad rss-counter state mm:(____ptrval____) idx:0 val:-1 BUG: Bad rss-counter state mm:(____ptrval____) idx:1 val:1 We tested with different kernel version and found it started from the the following commit: commit d10bcf947a3e ("RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs") In this commit, the sg->offset is always 0 when sg_set_page() is called in ib_umem_get() and the drivers are not allowed to change the sgl, otherwise it will get bad page descriptor when unfolding SGEs in __ib_umem_release() as sg_page_count() will get wrong result while sgl->offset is not 0. However, there is a weird sgl usage in the current hns driver, the driver modified sg->offset after calling ib_umem_get(), which caused we iterate past the wrong number of pages in for_each_sg_page iterator. This patch fixes it by correcting the non-standard sgl usage found in the hns_roce_db_map_user() function. Fixes: d10bcf947a3e ("RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs") Fixes: 0425e3e6e0c7 ("RDMA/hns: Support flush cqe for hip08 in kernel space") Link: https://lore.kernel.org/r/1562808737-45723-1-git-send-email-oulijun@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* Merge branch 'work.mount0' of ↵Linus Torvalds2019-07-191-9/+17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount updates from Al Viro: "The first part of mount updates. Convert filesystems to use the new mount API" * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) mnt_init(): call shmem_init() unconditionally constify ksys_mount() string arguments don't bother with registering rootfs init_rootfs(): don't bother with init_ramfs_fs() vfs: Convert smackfs to use the new mount API vfs: Convert selinuxfs to use the new mount API vfs: Convert securityfs to use the new mount API vfs: Convert apparmorfs to use the new mount API vfs: Convert openpromfs to use the new mount API vfs: Convert xenfs to use the new mount API vfs: Convert gadgetfs to use the new mount API vfs: Convert oprofilefs to use the new mount API vfs: Convert ibmasmfs to use the new mount API vfs: Convert qib_fs/ipathfs to use the new mount API vfs: Convert efivarfs to use the new mount API vfs: Convert configfs to use the new mount API vfs: Convert binfmt_misc to use the new mount API convenience helper: get_tree_single() convenience helper get_tree_nodev() vfs: Kill sget_userns() ...
| * vfs: Convert qib_fs/ipathfs to use the new mount APIDavid Howells2019-07-051-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the qib_fs/ipathfs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. [Q] Can qib_remove() race with qibfs_kill_super()? Should qib_super accesses be serialised with some sort of lock? [A] yes, it can and no, that's not the right solution. See vfs.git #qibfs for an old attempt to handle that cleanly. Infiniband folks were not interested... Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> cc: Mike Marciniszyn <mike.marciniszyn@intel.com> cc: linux-rdma@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2019-07-16115-21746/+4131Star
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "A smaller cycle this time. Notably we see another new driver, 'Soft iWarp', and the deletion of an ancient unused driver for nes. - Revise and simplify the signature offload RDMA MR APIs - More progress on hoisting object allocation boiler plate code out of the drivers - Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib, i40iw - Tree wide cleanups: struct_size, put_user_page, xarray, rst doc conversion - Removal of obsolete ib_ucm chardev and nes driver - netlink based discovery of chardevs and autoloading of the modules providing them - Move more of the rdamvt/hfi1 uapi to include/uapi/rdma - New driver 'siw' for software based iWarp running on top of netdev, much like rxe's software RoCE. - mlx5 feature to report events in their raw devx format to userspace - Expose per-object counters through rdma tool - Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core from netdev" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (194 commits) RMDA/siw: Require a 64 bit arch RDMA/siw: Mark expected switch fall-throughs RDMA/core: Fix -Wunused-const-variable warnings rdma/siw: Remove set but not used variable 's' rdma/siw: Add missing dependencies on LIBCRC32C and DMA_VIRT_OPS RDMA/siw: Add missing rtnl_lock around access to ifa rdma/siw: Use proper enumerated type in map_cqe_status RDMA/siw: Remove unnecessary kthread create/destroy printouts IB/rdmavt: Fix variable shadowing issue in rvt_create_cq RDMA/core: Fix race when resolving IP address RDMA/core: Make rdma_counter.h compile stand alone IB/core: Work on the caller socket net namespace in nldev_newlink() RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM RDMA/mlx5: Set RDMA DIM to be enabled by default RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink RDMA/core: Provide RDMA DIM support for ULPs linux/dim: Implement RDMA adaptive moderation (DIM) IB/mlx5: Report correctly tag matching rendezvous capability docs: infiniband: add it to the driver-api bookset IB/mlx5: Implement VHCA tunnel mechanism in DEVX ...
| * | RDMA/mlx5: Set RDMA DIM to be enabled by defaultLeon Romanovsky2019-07-081-0/+2
| | | | | | | | | | | | | | | | | | | | | Enable RDMA DIM by default for better user experience. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | IB/mlx5: Report correctly tag matching rendezvous capabilityDanit Goldberg2019-07-081-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userspace expects the IB_TM_CAP_RC bit to indicate that the device supports RC transport tag matching with rendezvous offload. However the firmware splits this into two capabilities for eager and rendezvous tag matching. Only if the FW supports both modes should userspace be told the tag matching capability is available. Cc: <stable@vger.kernel.org> # 4.13 Fixes: eb761894351d ("IB/mlx5: Fill XRQ capabilities") Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | Merge branch 'vhca-tunnel' into rdma.git for-nextJason Gunthorpe2019-07-081-4/+20
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Max Gurtovoy says: ==================== Those two patches introduce VHCA tunnel mechanism to DEVX interface needed for Bluefield SOC. See extensive commit messages for more information. ==================== Based on the mlx5-next branch from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for dependencies * branch 'vcha-tunnel': IB/mlx5: Implement VHCA tunnel mechanism in DEVX net/mlx5: Introduce VHCA tunnel device capability
| | * | IB/mlx5: Implement VHCA tunnel mechanism in DEVXMax Gurtovoy2019-07-081-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This mechanism will allow function-A to perform operations "on behalf" of function-B via tunnel object. Function-A will have privileges for creating and using this tunnel object. For example, in the device emulation feature presented in Bluefield-1 SoC, using device emulation capability, one can present NVMe function to the host OS. Since the NVMe function doesn't have a normal command interface to the HCA HW, here is a need to create a channel that will be able to issue commands "on behalf" of this function. This channel is the VHCA_TUNNEL general object. The emulation software will create this tunnel for every managed function and issue commands via devx general cmd interface using the appropriate tunnel ID. When devX context will receive a command with non-zero vhca_tunnel_id, it will pass the command as-is down to the HCA. All the validation, security and resource tracking of the commands and the created tunneled objects is in the responsibility of the HCA FW. When a VHCA_TUNNEL object destroyed, the device will issue an internal FLR (function level reset) to the emulated function associated with this tunnel. This will destroy all the created resources using the tunnel mechanism. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Clean up unnecessary variable initializationLang Cheng2019-07-074-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Here Clean up unnecessary initial value for some variable. Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Fixs hw access invalid dma memory errorXi Wang2019-07-071-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When smmu is enable, if execute the perftest command and then use 'kill -9' to exit, follow this operation repeatedly, the kernel will have a high probability to print the following smmu event: arm-smmu-v3 arm-smmu-v3.1.auto: event 0x10 received: arm-smmu-v3 arm-smmu-v3.1.auto: 0x00007d0000000010 arm-smmu-v3 arm-smmu-v3.1.auto: 0x0000020900000080 arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000 arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000 This is because the hw will periodically refresh the qpc cache until the next reset. This patch fixed it by removing the action that release qpc memory in the 'hns_roce_qp_free' function. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Use %pK format pointer printLang Cheng2019-07-071-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The format specifier \"%p\" can leak kernel addresses. Use \"%pK\" instead. Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Bugfix for calculating qp buffer sizeLijun Ou2019-07-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The buffer size of qp which used to allocate qp buffer space for storing sqwqe and rqwqe will be the length of buffer space. The kernel driver will use the buffer address and the same size to get the user memory. The same size named buff_size of qp. According the algorithm of calculating, The size of the two is not equal when users set the max sge of sq. Fixes: b28ca7cceff8 ("RDMA/hns: Limit extend sq sge num") Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Set reset flag when hw resettingLang Cheng2019-07-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When hw resetting, there is no response from hw when driver sending cmdq. If driver still send cmdq to hw, the reset process may be blocked. So reset flag should be set to intercept the cmdq command when driver receiving "notify down" signal. Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Modify ba page size for cqeYangyang Li2019-07-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the depth of cq only supports 64K. According to the UM, the depth of cq is up to 4M, Therefore the ba page size of cqe was modified to support the maximum specification of cq depth. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Fixup qp release bugchenglang2019-07-052-6/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hip06 reserve 12 qps, Hip08 reserve 8 qps. When the QP is released, the chip model is not judged, and the Hip08 cannot release the qpn 8~12 Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Bugfix for cleaning mtrLijun Ou2019-07-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It uses hns_roce_mtr_init in hns_roce_create_qp_common function. As a result, it should use hns_roce_mtr_cleanup function for cleaning mtr when destroying qp. Fixes: 8d18ad83f19b ("RDMA/hns: Fix bug when wqe num is larger than 16K") Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Add counter_alloc_stats() and counter_update_stats() supportMark Zhang2019-07-051-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for ib callback counter_alloc_stats() and counter_update_stats(). Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Support statistic q counter configurationMark Zhang2019-07-051-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for ib callbacks counter_bind_qp(), counter_unbind_qp() and counter_dealloc(). Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Add counter set id as a parameter for mlx5_ib_query_q_counters()Mark Zhang2019-07-051-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add counter set id as a parameter so that this API can be used for querying any q counter. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Support set qp counterMark Zhang2019-07-052-2/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support bind a qp with counter. If counter is null then bind the qp to the default counter. Different QP state has different operation: - RESET: Set the counter field so that it will take effective during RST2INIT change; - RTS: Issue an RTS2RTS change to update the QP counter; - Other: Set the counter field and mark the counter_pending flag, when QP is moved to RTS state and this flag is set, then issue an RTS2RTS modification to update the counter. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | Merge mlx5-next into rdma for-nextJason Gunthorpe2019-07-051-1/+1
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Required for dependencies in the next patches. * mlx5-next: net/mlx5: Add rts2rts_qp_counters_set_id field in hca cap net/mlx5: Properly name the generic WQE control field net/mlx5: Introduce TLS TX offload hardware bits and structures net/mlx5: Refactor mlx5_esw_query_functions for modularity net/mlx5: E-Switch prepare functions change handler to be modular net/mlx5: Introduce and use mlx5_eswitch_get_total_vports()
| * | | RDMA/efa: Entropy in admin commands idDaniel Kranzdorf2019-07-041-21/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make admin commands id easier to distinguish by using relevant bits from the producer counter. This allows us to differentiate admin commands with the same producer index (happens after admin queue overlap), which is helpful when debugging. Signed-off-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Use proper allocation API to get zeroed memoryLeon Romanovsky2019-07-041-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need in custom memory zeroing, because it can be done by using kzalloc from the beginning. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Fix building modular hnsLijun Ou2019-07-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch below wasn't fully tested for all combinations of module and configs, and causes a compile failure: WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_ah.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_alloc.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_cmd.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_cq.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_db.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_hem.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_mr.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_pd.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_qp.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_restrack.o see include/linux/module.h for more information WARNING: modpost: missing MODULE_LICENSE() in drivers/infiniband/hw/hns/hns_roce_srq.o see include/linux/module.h for more information ERROR: "hns_roce_bitmap_cleanup" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined! ERROR: "hns_roce_bitmap_init" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined! ERROR: "hns_roce_free_cmd_mailbox" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined! ERROR: "hns_roce_alloc_cmd_mailbox" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined! ERROR: "hns_roce_table_get" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined! ERROR: "hns_roce_bitmap_alloc" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined! ERROR: "hns_roce_table_find" [drivers/infiniband/hw/hns/hns_roce_srq.ko] undefined! The fix is to put the module sub components in the right line. Fixes: e9816ddf2a33 ("RDMA/hns: Cleanup unnecessary exported symbols") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: DEVX cleanup mdevYishai Hadas2019-07-031-10/+8Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No need any more to hold mlx5_core_dev on the devx_object, it can be accessed from ib_dev. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Add DEVX support for CQ eventsYishai Hadas2019-07-031-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add DEVX support for CQ events by creating and destroying the CQ via mlx5_core and set an handler to manage its completions. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Implement DEVX dispatching eventYishai Hadas2019-07-031-3/+300
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement DEVX dispatching event by looking up for the applicable subscriptions for the reported event and using their target fd to signal/set the event. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Enable subscription for device events over DEVXYishai Hadas2019-07-031-7/+553
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable subscription for device events over DEVX. Each subscription is added to the two level xarray data structure according to its event number and the DEVX object information in case was given with the given target fd. Those events will be reported over the given fd once will occur. Downstream patches will mange the dispatching to any subscription. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Register DEVX with mlx5_core to get async eventsYishai Hadas2019-07-033-2/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Register DEVX with with mlx5_core to get async events. This will enable to dispatch the applicable events to its consumers in down stream patches. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>