| Commit message (Collapse) | Author | Age | Files | Lines |
Jason writes:
"Second RDMA rc pull request
- Fix a long standing race bug when destroying comp_event file descriptors
- srp, hfi1, bnxt_re: Various driver crashes from missing validation
and other cases
- Fixes for regressions in patches merged this window in the gid
cache, devx, ucma and uapi."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Set right entry state before releasing reference
IB/mlx5: Destroy the DEVX object upon error flow
IB/uverbs: Free uapi on destroy
RDMA/bnxt_re: Fix system crash during RDMA resource initialization
IB/hfi1: Fix destroy_qp hang after a link down
IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
IB/hfi1: Invalid user input can result in crash
IB/hfi1: Fix SL array bounds check
RDMA/uverbs: Fix validity check for modify QP
IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
ucma: fix a use-after-free in ucma_resolve_ip()
RDMA/uverbs: Atomically flush and mark closed the comp event queue
cxgb4: fix abort_req_rss6 struct
|
rvt_destroy_qp() cannot complete until all in process packets have
been released from the underlying hardware. If a link down event
occurs, an application can hang with a kernel stack similar to:
cat /proc/<app PID>/stack
quiesce_qp+0x178/0x250 [hfi1]
rvt_reset_qp+0x23d/0x400 [rdmavt]
rvt_destroy_qp+0x69/0x210 [rdmavt]
ib_destroy_qp+0xba/0x1c0 [ib_core]
nvme_rdma_destroy_queue_ib+0x46/0x80 [nvme_rdma]
nvme_rdma_free_queue+0x3c/0xd0 [nvme_rdma]
nvme_rdma_destroy_io_queues+0x88/0xd0 [nvme_rdma]
nvme_rdma_error_recovery_work+0x52/0xf0 [nvme_rdma]
process_one_work+0x17a/0x440
worker_thread+0x126/0x3c0
kthread+0xcf/0xe0
ret_from_fork+0x58/0x90
0xffffffffffffffff
quiesce_qp() waits until all outstanding packets have been freed.
This wait should be momentary. During a link down event, the cleanup
handling does not ensure that all packets caught by the link down are
flushed properly.
This is caused by the fact that the freeze path and the link down
event are handled the same way. This is not correct. The freeze path
waits until the HFI is unfrozen and then restarts PIO. A link down
is not a freeze event. The link down path cannot restart the PIO
until link is restored. If the PIO path is restarted before the link
comes up, the application (QP) using the PIO path will hang (until
link is restored).
Fix by separating the linkdown path from the freeze path and use the
link down path for link down events.
Close a race condition in sc_disable() by acquiring both the progress
and release locks.
Close a race condition in sc_stop() by moving the setting of the flag
bits under the alloc lock.
Cc: <stable@vger.kernel.org> # 4.9.x+
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
If a packet stream uses an UnsupportedVL (virtual lane), the send
engine will not send the packet, and it will not indicate that an
error has occurred. This will cause the packet stream to block.
HFI has 8 virtual lanes available for packet streams. Each lane can
be enabled or disabled using the UnsupportedVL mask. If a lane is
disabled, adding a packet to the send context must be disallowed.
The current mask for determining unsupported VLs defaults to 0 (allow
all). This is incorrect. Only the VLs that are defined should be
allowed.
Determine which VLs are disabled (mtu == 0), and set the appropriate
unsupported bit in the mask. The correct mask will allow the send
engine to error on the invalid VL, and error recovery will work
correctly.
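The mask derivation described above can be sketched in plain C. This is a userspace illustration only, not the driver code: the function name, the `mtu` array layout, and `NUM_VLS` are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VLS 8 /* the text above says HFI has 8 virtual lanes */

/*
 * Hypothetical helper: build a mask with one bit per virtual lane.
 * A set bit marks the lane as unsupported. A lane whose mtu is 0 is
 * treated as disabled, as in the description above.
 */
uint8_t unsupported_vl_mask(const uint16_t mtu[NUM_VLS])
{
    uint8_t mask = 0;

    assert(mtu);
    for (int vl = 0; vl < NUM_VLS; vl++) {
        if (mtu[vl] == 0)
            mask |= (uint8_t)(1u << vl);
    }
    return mask;
}
```

With every lane enabled the mask is 0, which matches the old "allow all" default; any lane with a zero MTU contributes its bit.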
Cc: <stable@vger.kernel.org> # 4.9.x+
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
If the number of packets in a user sdma request does not match
the actual iovectors being sent, sdma_cleanup can be called on
an uninitialized request structure, resulting in a crash similar
to this:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
PGD 8000001044f61067 PUD 1052706067 PMD 0
Oops: 0000 [#1] SMP
CPU: 30 PID: 69912 Comm: upsm Kdump: loaded Tainted: G OE
------------ 3.10.0-862.el7.x86_64 #1
Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
SE5C610.86B.01.01.0019.101220160604 10/12/2016
task: ffff8b331c890000 ti: ffff8b2ed1f98000 task.ti: ffff8b2ed1f98000
RIP: 0010:[<ffffffffc0ae8bb7>] [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0
[hfi1]
RSP: 0018:ffff8b2ed1f9bab0 EFLAGS: 00010286
RAX: 0000000000008b2b RBX: ffff8b2adf6e0000 RCX: 0000000000000000
RDX: 00000000000000a0 RSI: ffff8b2e9eedc540 RDI: ffff8b2adf6e0000
RBP: ffff8b2ed1f9bad8 R08: 0000000000000000 R09: ffffffffc0b04a06
R10: ffff8b331c890190 R11: ffffe6ed00bf1840 R12: ffff8b3315480000
R13: ffff8b33154800f0 R14: 00000000fffffff2 R15: ffff8b2e9eedc540
FS: 00007f035ac47740(0000) GS:ffff8b331e100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000c03fe6000 CR4: 00000000001607e0
Call Trace:
[<ffffffffc0b0570d>] user_sdma_send_pkts+0xdcd/0x1990 [hfi1]
[<ffffffff9fe75fb0>] ? gup_pud_range+0x140/0x290
[<ffffffffc0ad3105>] ? hfi1_mmu_rb_insert+0x155/0x1b0 [hfi1]
[<ffffffffc0b0777b>] hfi1_user_sdma_process_request+0xc5b/0x11b0 [hfi1]
[<ffffffffc0ac193a>] hfi1_aio_write+0xba/0x110 [hfi1]
[<ffffffffa001a2bb>] do_sync_readv_writev+0x7b/0xd0
[<ffffffffa001bede>] do_readv_writev+0xce/0x260
[<ffffffffa022b089>] ? tty_ldisc_deref+0x19/0x20
[<ffffffffa02268c0>] ? n_tty_ioctl+0xe0/0xe0
[<ffffffffa001c105>] vfs_writev+0x35/0x60
[<ffffffffa001c2bf>] SyS_writev+0x7f/0x110
[<ffffffffa051f7d5>] system_call_fastpath+0x1c/0x21
Code: 06 49 c7 47 18 00 00 00 00 0f 87 89 01 00 00 5b 41 5c 41 5d 41 5e 41 5f
5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 8b 4e 10 48 89 fb <48> 8b 51 08 49 89 d4
83 e2 0c 41 81 e4 00 e0 00 00 48 c1 ea 02
RIP [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
RSP <ffff8b2ed1f9bab0>
CR2: 0000000000000008
There are two exit points from user_sdma_send_pkts(). One (free_tx)
merely frees the slab entry and one (free_txreq) cleans the sdma_txreq
prior to freeing the slab entry. The free_txreq variation can only be
called after one of the sdma_init*() variations has been called.
In the panic case, the slab entry had been allocated but not inited.
Fix the issue by exiting through free_tx thus avoiding sdma_clean().
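The two-label exit pattern can be sketched as a small userspace model. The struct, the function names, and the failure flags below are invented for illustration; only the label names free_tx and free_txreq come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct txreq {
    bool initialized;   /* set once the init step has run */
    int *descs;         /* only valid after init */
};

int cleaned;            /* counts calls to the clean step, for illustration */

static void txreq_clean(struct txreq *tx)
{
    assert(tx->initialized); /* cleaning an uninitialized request is the bug */
    free(tx->descs);
    cleaned++;
}

/* Hypothetical send path. Returns 0 on success, -1 on failure. */
int send_one(bool bad_header, bool queue_full)
{
    struct txreq *tx = malloc(sizeof(*tx));
    int ret = -1;

    if (!tx)
        return -1;
    tx->initialized = false;

    if (bad_header)
        goto free_tx;       /* nothing initialized yet: free only */

    tx->descs = calloc(4, sizeof(*tx->descs));
    if (!tx->descs)
        goto free_tx;
    tx->initialized = true;

    if (queue_full)
        goto free_txreq;    /* initialized: clean, then free */

    ret = 0;                /* this model completes the request synchronously */
free_txreq:
    txreq_clean(tx);
free_tx:
    free(tx);
    return ret;
}
```

The point of the ordering is that a failure before initialization jumps past the clean step, so the clean step never sees a half-built request.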
Cc: <stable@vger.kernel.org> # 4.9.x+
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
The SL specified by a user needs to be a valid SL.
Add a range check to the user specified SL value which protects from
running off the end of the SL to SC table.
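A range check of this kind might look like the following userspace sketch. The table size and the function name are assumptions for the example, not the driver's actual values.

```c
#include <assert.h>
#include <stdint.h>

#define SL_TABLE_SIZE 32 /* hypothetical size of the SL to SC table */

/*
 * Hypothetical lookup: translate a user-supplied SL to an SC.
 * Returns 0 and stores the SC on success, -1 if the SL is out of range.
 */
int sl_to_sc(const uint8_t table[SL_TABLE_SIZE], unsigned int sl, uint8_t *sc)
{
    assert(table && sc);
    if (sl >= SL_TABLE_SIZE)
        return -1;      /* reject before indexing the table */
    *sc = table[sl];
    return 0;
}
```

Taking the SL as an unsigned value means a single comparison covers both "too large" and "negative after conversion".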
CC: stable@vger.kernel.org
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Calling into the new API to reset the secondary bus results in a deadlock.
This occurs because the device/bus is already locked at probe time.
Reverting back to the old behavior while the API is improved.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200985
Fixes: c6a44ba950d1 ("PCI: Rename pci_try_reset_bus() to pci_reset_bus()")
Fixes: 409888e0966e ("IB/hfi1: Use pci_try_reset_bus() for initiating PCI Secondary Bus Reset")
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Sinan Kaya <okaya@codeaurora.org>
|
Pull more rdma updates from Jason Gunthorpe:
"This is the SMC cleanup promised, a randconfig regression fix, and
kernel oops fix.
Summary:
- Switch SMC over to rdma_get_gid_attr and remove the compat
- Fix a crash in HFI1 with some BIOS's
- Fix a randconfig failure"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/ucm: fix UCM link error
IB/hfi1: Invalid NUMA node information can cause a divide by zero
RDMA/smc: Replace ib_query_gid with rdma_get_gid_attr
|
If the system BIOS does not supply NUMA node information to the
PCI devices, the NUMA node is selected by choosing the current
node.
This can lead to the following crash:
divide error: 0000 SMP
CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: G IOE
------------ 3.10.0-693.21.1.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS
SE5C610.86B.01.01.0005.101720141054 10/17/2014
Workqueue: events work_for_cpu_fn
task: ffff880174480fd0 ti: ffff880174488000 task.ti: ffff880174488000
RIP: 0010: [<ffffffffc020ac69>] hfi1_dev_affinity_init+0x129/0x6a0 [hfi1]
RSP: 0018:ffff88017448bbf8 EFLAGS: 00010246
RAX: 0000000000000011 RBX: ffff88107ffba6c0 RCX: ffff88085c22e130
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880824ad0000
RBP: ffff88017448bc48 R08: 0000000000000011 R09: 0000000000000002
R10: ffff8808582b6ca0 R11: 0000000000003151 R12: ffff8808582b6ca0
R13: ffff8808582b6518 R14: ffff8808582b6010 R15: 0000000000000012
FS: 0000000000000000(0000) GS:ffff88085ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efc707404f0 CR3: 0000000001a02000 CR4: 00000000001607f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
hfi1_init_dd+0x14b3/0x27a0 [hfi1]
? pcie_capability_write_word+0x46/0x70
? hfi1_pcie_init+0xc0/0x200 [hfi1]
do_init_one+0x153/0x4c0 [hfi1]
? sched_clock_cpu+0x85/0xc0
init_one+0x1b5/0x260 [hfi1]
local_pci_probe+0x4a/0xb0
work_for_cpu_fn+0x1a/0x30
process_one_work+0x17f/0x440
worker_thread+0x278/0x3c0
? manage_workers.isra.24+0x2a0/0x2a0
kthread+0xd1/0xe0
? insert_kthread_work+0x40/0x40
ret_from_fork+0x77/0xb0
? insert_kthread_work+0x40/0x40
If the BIOS is not supplying NUMA information:
- set the default table count to 1 for all possible nodes
- select node 0 (instead of the current NUMA node) to get consistent
performance
- generate an error indicating that the BIOS should be upgraded
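The fallback can be modeled in a few lines of userspace C. The names below are invented for the example, and the sketch clamps the count at the point of use, which is a variation on defaulting the table count to 1 as the list above describes.

```c
#include <assert.h>

#define NO_NODE (-1) /* stand-in for "BIOS supplied no NUMA node" */

/* Pick the node for a device; fall back to node 0 when it is unknown. */
int pick_node(int reported_node)
{
    return reported_node == NO_NODE ? 0 : reported_node;
}

/*
 * Spread `items` across the entries available on a node. Treating a
 * count of zero as one keeps the division well defined.
 */
int per_entry_share(int items, int entries_on_node)
{
    assert(items >= 0);
    if (entries_on_node < 1)
        entries_on_node = 1;
    return items / entries_on_node;
}
```

Always choosing node 0 rather than whichever node the probing CPU happens to be on makes the placement the same on every boot.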
Reviewed-by: Gary Leshner <gary.s.leshner@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee forward progress so it cannot
depend on any sleepable locks.
Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep. That can result in selecting a new
oom victim prematurely because the previous one still hasn't torn its
memory down yet.
We can do much better though. Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held. Moreover,
the majority of notifiers only care about a portion of the address space and
there is absolutely zero reason to fail when we are unmapping an unrelated
range. Many notifiers do really block and wait for HW which is harder to
handle and we have to bail out though.
This patch handles the low hanging fruit.
__mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
are not allowed to sleep if the flag is set to false. This is achieved by
using trylock instead of the sleepable lock for most callbacks and
continuing as long as we do not block down the call chain.
I think we can improve that even further because there is a common pattern
to do a range lookup first and then do something about that. The first
part can be done without a sleeping lock in most cases AFAICS.
The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode. A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.
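The non-blocking callback plus retry loop can be illustrated with a single-threaded userspace model. Nothing here is the kernel API: the lock is a plain flag, -1 stands in for -EAGAIN, and all names are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

struct model_lock { bool held; };

static bool model_trylock(struct model_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void model_unlock(struct model_lock *l)
{
    l->held = false;
}

struct notifier {
    struct model_lock lock;
    unsigned long start, end;   /* range this notifier tracks: [start, end) */
    int invalidations;
};

/* Non-blockable path: returns 0 on success, -1 if it would have to sleep. */
int invalidate_nonblocking(struct notifier *n, unsigned long start,
                           unsigned long end)
{
    assert(start <= end);
    if (end <= n->start || start >= n->end)
        return 0;               /* unrelated range: no lock needed */
    if (!model_trylock(&n->lock))
        return -1;              /* lock is contended: report, do not wait */
    n->invalidations++;
    model_unlock(&n->lock);
    return 0;
}

/* Reaper side: retry a bounded number of times instead of blocking. */
bool reap(struct notifier *n, unsigned long start, unsigned long end,
          int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (invalidate_nonblocking(n, start, end) == 0)
            return true;
        /* a real reaper would sleep briefly before the next attempt */
    }
    return false;
}
```

The model shows both halves of the argument: an unrelated range succeeds even while the lock is held, and a related range only fails when the lock is actually contended.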
The simplest way for driver developers to test this code path is to wrap
userspace code which uses these notifiers into a memcg and set the hard
limit to hit the oom. This can be done e.g. after the test faults in all
the mmu notifier managed memory and set the hard limit to something really
small. Then we are looking for a proper process tear down.
[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: minor code simplification]
Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
Reported-by: David Rientjes <rientjes@google.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
rdma.git merge resolution for the 4.19 merge window
Conflicts:
drivers/infiniband/core/rdma_core.c
- Use the rdma code and revise with the new spelling for
atomic_fetch_add_unless
drivers/nvme/host/rdma.c
- Replace max_sge with max_send_sge in new blk code
drivers/nvme/target/rdma.c
- Use the blk code and revise to use NULL for ib_post_recv when
appropriate
- Replace max_sge with max_recv_sge in new blk code
net/rds/ib_send.c
- Use the net code and revise to use NULL for ib_post_recv when
appropriate
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
- Decode AER errors with names similar to "lspci" (Tyler Baicar)
- Expose AER statistics in sysfs (Rajat Jain)
- Clear AER status bits selectively based on the type of recovery (Oza
Pawandeep)
- Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru
Gagniuc)
- Don't clear AER status bits if we're using the "Firmware-First"
strategy where firmware owns the registers (Alexandru Gagniuc)
- Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy
Shevchenko)
- Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas)
- Defer DPC event handling to work queue (Keith Busch)
- Use threaded IRQ for DPC bottom half (Keith Busch)
- Print AER status while handling DPC events (Keith Busch)
- Work around IDT switch ACS Source Validation erratum (James
Puthukattukaran)
- Emit diagnostics for all cases of PCIe Link downtraining (Links
operating slower than they're capable of) (Alexandru Gagniuc)
- Skip VFs when configuring Max Payload Size (Myron Stowe)
- Reduce Root Port Max Payload Size if necessary when hot-adding a
device below it (Myron Stowe)
- Simplify SHPC existence/permission checks (Bjorn Helgaas)
- Remove hotplug sample skeleton driver (Lukas Wunner)
- Convert pciehp to threaded IRQ handling (Lukas Wunner)
- Improve pciehp tolerance of missed events and initially unstable
links (Lukas Wunner)
- Clear spurious pciehp events on resume (Lukas Wunner)
- Add pciehp runtime PM support, including for Thunderbolt controllers
(Lukas Wunner)
- Support interrupts from pciehp bridges in D3hot (Lukas Wunner)
- Mark fall-through switch cases before enabling -Wimplicit-fallthrough
(Gustavo A. R. Silva)
- Move DMA-debug PCI init from arch code to PCI core (Christoph
Hellwig)
- Fix pci_request_irq() usage of IRQF_ONESHOT when no handler is
supplied (Heiner Kallweit)
- Unify PCI and DMA direction #defines (Shunyong Yang)
- Add PCI_DEVICE_DATA() macro (Andy Shevchenko)
- Check for VPD completion before checking for timeout (Bert Kenward)
- Limit Netronome NFP5000 config space size to work around erratum
(Jakub Kicinski)
- Set IRQCHIP_ONESHOT_SAFE for PCI MSI irqchips (Heiner Kallweit)
- Document ACPI description of PCI host bridges (Bjorn Helgaas)
- Add "pci=disable_acs_redir=" parameter to disable ACS redirection for
peer-to-peer DMA support (we don't have the peer-to-peer support yet;
this is just one piece) (Logan Gunthorpe)
- Clean up devm_of_pci_get_host_bridge_resources() resource allocation
(Jan Kiszka)
- Fixup resizable BARs after suspend/resume (Christian König)
- Make "pci=earlydump" generic (Sinan Kaya)
- Fix ROM BAR access routines to stay in bounds and check for signature
correctly (Rex Zhu)
- Add DMA alias quirk for Microsemi Switchtec NTB (Doug Meyer)
- Expand documentation for pci_add_dma_alias() (Logan Gunthorpe)
- To avoid bus errors, enable PASID only if entire path supports
End-End TLP prefixes (Sinan Kaya)
- Unify slot and bus reset functions and remove hotplug knowledge from
callers (Sinan Kaya)
- Add Function-Level Reset quirks for Intel and Samsung NVMe devices to
fix guest reboot issues (Alex Williamson)
- Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD
Controller (Bjorn Helgaas)
- Remove Xilinx AXI-PCIe host bridge arch dependency (Palmer Dabbelt)
- Remove Aardvark outbound window configuration (Evan Wang)
- Fix Aardvark bridge window sizing issue (Zachary Zhang)
- Convert Aardvark to use pci_host_probe() to reduce code duplication
(Thomas Petazzoni)
- Correct the Cadence cdns_pcie_writel() signature (Alan Douglas)
- Add Cadence support for optional generic PHYs (Alan Douglas)
- Add Cadence power management ops (Alan Douglas)
- Remove redundant variable from Cadence driver (Colin Ian King)
- Add Kirin MSI support (Xiaowei Song)
- Drop unnecessary root_bus_nr setting from exynos, imx6, keystone,
armada8k, artpec6, designware-plat, histb, qcom, spear13xx (Shawn
Guo)
- Move link notification settings from DesignWare core to individual
drivers (Gustavo Pimentel)
- Add endpoint library MSI-X interfaces (Gustavo Pimentel)
- Correct signature of endpoint library IRQ interfaces (Gustavo
Pimentel)
- Add DesignWare endpoint library MSI-X callbacks (Gustavo Pimentel)
- Add endpoint library MSI-X test support (Gustavo Pimentel)
- Remove unnecessary GFP_ATOMIC from Hyper-V "new child" allocation
(Jia-Ju Bai)
- Add more devices to Broadcom PAXC quirk (Ray Jui)
- Work around corrupted Broadcom PAXC config space to enable SMMU and
GICv3 ITS (Ray Jui)
- Disable MSI parsing to work around broken Broadcom PAXC logic in some
devices (Ray Jui)
- Hide unconfigured functions to work around a Broadcom PAXC defect
(Ray Jui)
- Lower iproc log level to reduce console output during boot (Ray Jui)
- Fix mobiveil iomem/phys_addr_t type usage (Lorenzo Pieralisi)
- Fix mobiveil missing include file (Lorenzo Pieralisi)
- Add mobiveil Kconfig/Makefile support (Lorenzo Pieralisi)
- Fix mvebu I/O space remapping issues (Thomas Petazzoni)
- Use generic pci_host_bridge in mvebu instead of ARM-specific API
(Thomas Petazzoni)
- Whitelist VMD devices with fast interrupt handlers to avoid sharing
vectors with slow handlers (Keith Busch)
* tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (153 commits)
PCI/AER: Don't clear AER bits if error handling is Firmware-First
PCI: Limit config space size for Netronome NFP5000
PCI/MSI: Set IRQCHIP_ONESHOT_SAFE for PCI-MSI irqchips
PCI/VPD: Check for VPD access completion before checking for timeout
PCI: Add PCI_DEVICE_DATA() macro to fully describe device ID entry
PCI: Match Root Port's MPS to endpoint's MPSS as necessary
PCI: Skip MPS logic for Virtual Functions (VFs)
PCI: Add function 1 DMA alias quirk for Marvell 88SS9183
PCI: Check for PCIe Link downtraining
PCI: Add ACS Redirect disable quirk for Intel Sunrise Point
PCI: Add device-specific ACS Redirect disable infrastructure
PCI: Convert device-specific ACS quirks from NULL termination to ARRAY_SIZE
PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support
PCI: Allow specifying devices using a base bus and path of devfns
PCI: Make specifying PCI devices in kernel parameters reusable
PCI: Hide ACS quirk declarations inside PCI core
PCI: Delay after FLR of Intel DC P3700 NVMe
PCI: Disable Samsung SM961/PM961 NVMe before FLR
PCI: Export pcie_has_flr()
PCI: mvebu: Drop bogus comment above mvebu_pcie_map_registers()
...
|
Now that the old implementation of pci_reset_bus() is gone, replace
pci_try_reset_bus() with pci_reset_bus().
Compared to the old implementation, the new code will fail immediately with
-EAGAIN if object lock cannot be obtained.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
Drivers are expected to call pci_try_reset_slot() or pci_try_reset_bus()
depending on whether the system supports hotplug. A survey showed that most
drivers don't do this, and we are leaking hotplug capability to the user.
Hide pci_try_reset_slot() from drivers and embed it into pci_try_reset_bus().
Change pci_try_reset_bus() parameter from struct pci_bus to struct pci_dev.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
Getting ready to hide pci_reset_bridge_secondary_bus() from the drivers.
pci_reset_bridge_secondary_bus() should only be used internally by the
PCI code itself.
Other drivers should rely on the higher-level pci_try_reset_bus() API.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
All conflicts were trivial overlapping changes, so reasonably
easy to resolve.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
This patch makes it so that instead of passing a void pointer as the
accel_priv we instead pass a net_device pointer as sb_dev. Making this
change allows us to pass the subordinate device through to the fallback
function eventually so that we can keep the actual code in the
ndo_select_queue call as focused as possible on the exception cases.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
Resolve merge conflicts from the -rc cycle against the rdma.git tree:
Conflicts:
drivers/infiniband/core/uverbs_cmd.c
- New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
- Merge removal of file->ucontext in for-next with new code in -rc
drivers/infiniband/core/uverbs_main.c
- for-next removed code from ib_uverbs_write() that was modified
in for-rc
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
The __get_txreq() function can return a pointer, ERR_PTR(-EBUSY), or NULL.
All of the relevant call sites look for IS_ERR, so the NULL return would
lead to a NULL pointer exception.
Do not use the ERR_PTR mechanism for this function.
Update all call sites to handle the return value correctly.
Clean up error paths to reflect return value.
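Why a NULL return slips past an IS_ERR-only check can be shown with a userspace model of the error-pointer convention. The `model_` helpers imitate the idea behind ERR_PTR()/IS_ERR() but are written here for illustration; the allocator functions and the outcome enum are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095
#define MODEL_EBUSY 16

/* Encode a small negative errno in the top of the address space. */
static void *model_err_ptr(long err)
{
    assert(err < 0 && err >= -MODEL_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* True only for encoded errors. NULL is not in that range. */
bool model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct txreq { int id; };
static struct txreq pool_entry = { 1 };

enum outcome { GOT_ENTRY, POOL_BUSY, POOL_EMPTY };

/* Mixed convention: valid pointer, encoded error, or NULL. */
struct txreq *get_txreq_mixed(enum outcome o)
{
    switch (o) {
    case GOT_ENTRY: return &pool_entry;
    case POOL_BUSY: return model_err_ptr(-MODEL_EBUSY);
    default:        return NULL;
    }
}

/* Single convention: valid pointer or NULL, so one check covers all failures. */
struct txreq *get_txreq_simple(enum outcome o)
{
    return o == GOT_ENTRY ? &pool_entry : NULL;
}
```

With the mixed convention a caller needs both a NULL test and an error-pointer test; collapsing to a single convention leaves only one check to get right at each call site.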
Fixes: 45842abbb292 ("staging/rdma/hfi1: move txreq header code")
Cc: <stable@vger.kernel.org> # 4.9.x+
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Avoid that the following compiler warning is reported when building
with gcc 8:
drivers/infiniband/hw/hfi1/verbs.c:1896:2: warning: 'strncpy' output may be truncated copying 64 bytes from a string of length 64 [-Wstringop-truncation]
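For context, the warning fires because copying 64 bytes into a 64-byte field can leave it without a terminating NUL. One common userspace way to avoid that is sketched below; it is not necessarily what this commit does, and the function name and `DESC_LEN` are assumptions.

```c
#include <assert.h>
#include <string.h>

#define DESC_LEN 64

/* Copy src into a fixed 64-byte field, always leaving a terminating NUL. */
void copy_desc(char dst[DESC_LEN], const char *src)
{
    assert(dst && src);
    strncpy(dst, src, DESC_LEN - 1); /* copies at most 63 bytes, pads with NULs */
    dst[DESC_LEN - 1] = '\0';        /* guarantees termination for long input */
}
```

Long input is truncated to 63 characters rather than rejected; whether silent truncation is acceptable depends on how the field is used.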
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
grh_required is intended to be a global setting where all AV's will
require a GRH, not just the sm_lid. Move the special logic to the creation
of the SM AH.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
The general interrupt handler is_rcv_avail_int() has two paths,
do_interrupt() (callback) and handle_user_interrupt(). The
do_interrupt() callback is for the threaded receive handling.
is_rcv_avail_int() cannot handle threaded IRQs.
If the do_interrupt() path is taken, and the IRQ returns
IRQ_WAKE_THREAD, the IRQ behavior will be indeterminate.
Remove incorrect call to do_interrupt() from is_rcv_avail_int(),
leaving the un-threaded (handle_user_interrupt()) path.
Fixes: f4f30031c33c ("staging/rdma/hfi1: Thread the receive interrupt.")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
The in_use_ctxts bitmask is for user receive contexts only. Setting it for
any other type of receive context is incorrect.
Move initial set of in_use_ctxts bits from the general context init to the
user context specific init. Having this bit set can allow contexts to be
incorrectly identified by some IRQ handlers. With this change,
handle_user_interrupt() will filter user contexts correctly.
Clean up redundant is_rcv_urgent_int() user context check.
A follow-on patch will clean up an incorrect code path in
is_rcv_avail_int().
Fixes: 8737ce95c463 ("IB/hfi1: Fix an assign/ordering issue with shared context IDs")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
The INTx IRQ support does not work for all HFI1 IRQ handlers
(specifically the receive data IRQs).
Remove all supporting code for the INTx IRQ.
If the MSI-X vector request is unsuccessful, do not allow the
driver to continue.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Reviewed-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Many fields in ctxtdata are incorrectly sized and the organization of the
fields within the structure is a jumble.
Fix by:
- Correcting oversize fields.
- Putting fields common to all contexts at the top with hot fields
at the top.
- Moving PSM fields to the bottom of the structure.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Remove the sizeable cache of the chip sizing CSRs and replace with CSR
reads as needed.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| | |
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fields in this structure are oversized relative to what the hardware
limits and input values require.
Fix by reducing fields as appropriate and repositioning to close holes in
the structure.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
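The effect of right-sizing fields and repositioning them to close holes can be sketched in plain C. The two structs below are hypothetical and are not the driver's actual layout; they only illustrate how ordering members from largest to smallest alignment removes padding on typical ABIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: small fields interleaved with large
 * ones, so the compiler inserts padding to satisfy alignment. */
struct ctxt_before_sketch {
	uint8_t  flag;     /* followed by padding up to addr's alignment */
	uint64_t addr;
	uint8_t  entsize;  /* followed by padding up to count's alignment */
	uint32_t count;
};

/* Hypothetical "after" layout: same members, largest alignment first,
 * so the two one-byte fields share what used to be padding. */
struct ctxt_after_sketch {
	uint64_t addr;
	uint32_t count;
	uint8_t  flag;
	uint8_t  entsize;
};

/* Bytes saved by the reordering on the current ABI. */
size_t ctxt_bytes_saved(void)
{
	return sizeof(struct ctxt_before_sketch) -
	       sizeof(struct ctxt_after_sketch);
}
```

The exact sizes depend on the ABI, so the sketch reports only the difference rather than assuming specific numbers.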
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It is only ever written.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The usage of this ctxt data field is not hot path and the value can be
computed on demand to cut down the ctxtdata bloat.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Move some s_flags defines out of rdmavt and into hfi1 because they are
hfi1 specific and therefore should remain in the driver instead of
bubbling up to rdmavt.
Document device specific ranges in rdmavt and remap
those in hfi1.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The field is based on a constant that can never change.
Use the define to assign the register instead.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This field should be in ctxtdata to allow for better locality of access by
eliminating a dd dereference.
The new field is now side-by-side with rcvhdrqentsize since the rhf_offset
is a function of the rcvhdrqentsize.
Both fields are now correctly sized as u8.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The current implementation precludes having receive context specific
packet type receive handlers.
Fix this by adding a C99 const array for the existing handlers and
removing the current 72 bytes of pointers from devdata.
A new pointer in hfi1_ctxtdata will point to the const array.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
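The idea in the commit above, a shared const handler table with a per-context pointer to it, might look roughly like the following userspace sketch. All type, enum, and function names here are hypothetical stand-ins and not the driver's identifiers; the second table only shows why a per-context pointer permits context-specific handlers.

```c
/* Hypothetical packet types; the real driver's list differs. */
enum pkt_type { PKT_EXPECTED, PKT_EAGER, PKT_IB, PKT_TYPE_COUNT };

struct packet_sketch { enum pkt_type type; };

typedef int (*rcv_handler_fn)(const struct packet_sketch *pkt);

static int handle_expected(const struct packet_sketch *pkt) { (void)pkt; return 10; }
static int handle_eager(const struct packet_sketch *pkt) { (void)pkt; return 20; }
static int handle_ib(const struct packet_sketch *pkt) { (void)pkt; return 30; }
static int handle_drop(const struct packet_sketch *pkt) { (void)pkt; return -1; }

/* One read-only table shared by every context that wants the normal
 * handlers, built with C99 designated initializers. */
static const rcv_handler_fn normal_handlers[PKT_TYPE_COUNT] = {
	[PKT_EXPECTED] = handle_expected,
	[PKT_EAGER]    = handle_eager,
	[PKT_IB]       = handle_ib,
};

/* A second table that a particular context could point at instead. */
static const rcv_handler_fn drop_all_handlers[PKT_TYPE_COUNT] = {
	[PKT_EXPECTED] = handle_drop,
	[PKT_EAGER]    = handle_drop,
	[PKT_IB]       = handle_drop,
};

/* The context carries a single pointer instead of a copy of the table. */
struct ctxt_sketch {
	const rcv_handler_fn *rcv_handlers;
};

/* Dispatch a packet through the table selected for this context. */
int ctxt_receive(const struct ctxt_sketch *ctxt, const struct packet_sketch *pkt)
{
	if ((unsigned int)pkt->type >= PKT_TYPE_COUNT)
		return -1;
	return ctxt->rcv_handlers[pkt->type](pkt);
}
```

Because each table is const and shared, adding a context-specific behavior costs one more static table rather than another set of pointers in every device structure.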
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch replaces the ib_device_attr.max_sge with max_send_sge and
max_recv_sge. It allows ulps to take advantage of devices that have very
different send and recv sge depths. For example cxgb4 has a max_recv_sge
of 4, yet a max_send_sge of 16. Splitting out these attributes allows
much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW
API. Consider a large RDMA WRITE that has 16 scatter/gather entries.
With a max_sge of 4, the ulp would send 4 WRITE WRs, but with a
max_send_sge of 16, it can be done with 1 WRITE WR.
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
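The arithmetic in the example above is a rounded-up division. A minimal sketch, assuming a hypothetical helper name (wrs_needed is not part of the RDMA core API):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of WRITE WRs needed to cover nents
 * scatter/gather entries when one WR carries at most max_sge of them. */
uint32_t wrs_needed(uint32_t nents, uint32_t max_sge)
{
	assert(max_sge > 0);
	return (nents + max_sge - 1) / max_sge; /* round up */
}
```

With 16 entries, wrs_needed(16, 4) gives 4 WRs under the old combined limit and wrs_needed(16, 16) gives 1 WR once the send limit is reported separately.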
|
| |
The kvzalloc_node() function has no 2-factor argument form, so
multiplication factors need to be wrapped in array_size(). This patch
replaces cases of:
    kvzalloc_node(a * b, gfp, node)
with:
    kvzalloc_node(array_size(a, b), gfp, node)
as well as handling cases of:
    kvzalloc_node(a * b * c, gfp, node)
with:
    kvzalloc_node(array3_size(a, b, c), gfp, node)
This does, however, attempt to ignore constant size factors like:
    kvzalloc_node(4 * 1024, gfp, node)
though any constants defined via macros get caught up in the conversion.
Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.
The Coccinelle script used for this was:
// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@
(
kvzalloc_node(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kvzalloc_node(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)
// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@
(
kvzalloc_node(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kvzalloc_node(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kvzalloc_node(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kvzalloc_node(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kvzalloc_node(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kvzalloc_node(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kvzalloc_node(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kvzalloc_node(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)
// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@
(
kvzalloc_node(
- sizeof(TYPE) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(TYPE) * COUNT_ID
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(TYPE) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(TYPE) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(THING) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
kvzalloc_node(
- sizeof(THING) * COUNT_ID
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
kvzalloc_node(
- sizeof(THING) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
|
kvzalloc_node(
- sizeof(THING) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
)
// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@
kvzalloc_node(
- SIZE * COUNT
+ array_size(COUNT, SIZE)
, ...)
// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@
(
kvzalloc_node(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kvzalloc_node(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kvzalloc_node(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kvzalloc_node(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kvzalloc_node(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)
// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@
(
kvzalloc_node(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kvzalloc_node(
- sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kvzalloc_node(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kvzalloc_node(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kvzalloc_node(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kvzalloc_node(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)
// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@
(
kvzalloc_node(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kvzalloc_node(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kvzalloc_node(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kvzalloc_node(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kvzalloc_node(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kvzalloc_node(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kvzalloc_node(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kvzalloc_node(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)
// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@
(
kvzalloc_node(C1 * C2 * C3, ...)
|
kvzalloc_node(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)
// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@
(
kvzalloc_node(C1 * C2, ...)
|
kvzalloc_node(
- E1 * E2
+ array_size(E1, E2)
, ...)
)
Signed-off-by: Kees Cook <keescook@chromium.org>
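Why wrapping the factors matters: an open-coded a * b can wrap around and request a buffer that is too small, whereas the kernel helpers saturate to SIZE_MAX so the allocation fails instead. The following is a userspace stand-in, not the kernel implementation; the sketch_ names are hypothetical and __builtin_mul_overflow is a GCC/Clang builtin.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for array_size(): a * b, or SIZE_MAX if the product overflows. */
size_t sketch_array_size(size_t a, size_t b)
{
	size_t bytes;

	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Stand-in for array3_size(): a * b * c, or SIZE_MAX on any overflow. */
size_t sketch_array3_size(size_t a, size_t b, size_t c)
{
	size_t bytes;

	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;
	if (__builtin_mul_overflow(bytes, c, &bytes))
		return SIZE_MAX;
	return bytes;
}
```

An allocator handed SIZE_MAX cannot satisfy the request and returns NULL, which callers already handle, so the overflow becomes an ordinary allocation failure.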
|
|\
| |
| | |
Pull rdma updates from Jason Gunthorpe:
"This has been a quiet cycle for RDMA, the big bulk is the usual
smallish driver updates and bug fixes. About four new uAPI related
things. Not as many syzkaller patches this time, the bugs it finds are
getting harder to fix.
Summary:
- More work cleaning up the RDMA CM code
- Usual driver bug fixes and cleanups for qedr, qib, hfi1, hns,
i40iw, iw_cxgb4, mlx5, rxe
- Driver specific resource tracking and reporting via netlink
- Continued work for name space support from Parav
- MPLS support for the verbs flow steering uAPI
- A few tricky IPoIB fixes improving robustness
- HFI1 driver support for the '16B' management packet format
- Some auditing to not print kernel pointers via %llx or similar
- Mark the entire 'UCM' user-space interface as BROKEN with the
intent to remove it entirely. The user space side of this was long
ago replaced with RDMA-CM and syzkaller is finding bugs in the
residual UCM interface nobody wishes to fix because nobody uses it.
- Purge more bogus BUG_ON's from Leon
- 'flow counters' verbs uAPI
- T10 fixups for iser/isert, these are Acked by Martin but going
through the RDMA tree due to dependencies"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (138 commits)
RDMA/mlx5: Update SPDX tags to show proper license
RDMA/restrack: Change SPDX tag to properly reflect license
IB/hfi1: Fix comment on default hdr entry size
IB/hfi1: Rename exp_lock to exp_mutex
IB/hfi1: Add bypass register defines and replace blind constants
IB/hfi1: Remove unused variable
IB/hfi1: Ensure VL index is within bounds
IB/hfi1: Fix user context tail allocation for DMA_RTAIL
IB/hns: Use zeroing memory allocator instead of allocator/memset
infiniband: fix a possible use-after-free bug
iw_cxgb4: add INFINIBAND_ADDR_TRANS dependency
IB/isert: use T10-PI check mask definitions from core layer
IB/iser: use T10-PI check mask definitions from core layer
RDMA/core: introduce check masks for T10-PI offload
IB/isert: fix T10-pi check mask setting
IB/mlx5: Add counters read support
IB/mlx5: Add flow counters read support
IB/mlx5: Add flow counters binding support
IB/mlx5: Add counters create and destroy support
IB/uverbs: Add support for flow counters
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The comment for the default header queue entry size is incorrect.
Correct the comment and fix the resulting S_IRUGO warning that shows
up in the widened patch context.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The mutex exp_lock in struct hfi1_ctxtdata is used to protect all
Expected TID data of a user context. This patch renames it to exp_mutex
to better reflect its identity and prepare for upcoming patches.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
These registers were not added in the 16B work.
Add them and replace blind constants with the correct defines.
Fixes: 72c07e2b671e ("IB/hfi1: Add support to receive 16B bypass packets")
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The variable extended_psn is no longer used.
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Improve the safety of the code and ensure the array cannot be indexed
out of bounds when picking the CPU for a given SDMA engine.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
| |
| | |
The following code fails to allocate a buffer for the
tail address that the hardware DMAs into when the user
context DMA_RTAIL is set.
    if (HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL)) {
            rcd->rcvhdrtail_kvaddr = dma_zalloc_coherent(
                    &dd->pcidev->dev, PAGE_SIZE, &dma_hdrqtail,
                    gfp_flags);
            if (!rcd->rcvhdrtail_kvaddr)
                    goto bail_free;
            rcd->rcvhdrqtailaddr_dma = dma_hdrqtail;
    }
So the rcvhdrtail_kvaddr would then be NULL.
The mmap logic fails to check for a NULL rcvhdrtail_kvaddr.
The fix is to test for both user and kernel DMA_RTAIL options
during the allocation as well as testing for a NULL
rcvhdrtail_kvaddr during the mmap processing.
Additionally, all downstream testing of the capmask for DMA_RTAIL
has been eliminated in favor of testing rcvhdrtail_kvaddr.
Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
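The shape of the fix described above can be sketched in userspace C. This is an illustration under stated assumptions, not the driver code: the struct and function names are hypothetical, calloc() stands in for the coherent DMA allocation, and the two booleans stand in for the kernel and user DMA_RTAIL capability tests.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the receive context. */
struct rcd_sketch {
	void *rcvhdrtail_kvaddr; /* NULL when no tail buffer exists */
};

/* Allocate the tail buffer if either the kernel or the user
 * capability asks for it (the bug was testing only one of them). */
int alloc_tail_sketch(struct rcd_sketch *rcd, bool kernel_dma_rtail,
		      bool user_dma_rtail)
{
	rcd->rcvhdrtail_kvaddr = NULL;
	if (kernel_dma_rtail || user_dma_rtail) {
		rcd->rcvhdrtail_kvaddr = calloc(1, 4096);
		if (!rcd->rcvhdrtail_kvaddr)
			return -ENOMEM;
	}
	return 0;
}

/* Later code tests the pointer itself rather than re-deriving the
 * answer from the capability mask, so a missing buffer is refused. */
int mmap_tail_sketch(const struct rcd_sketch *rcd)
{
	if (!rcd->rcvhdrtail_kvaddr)
		return -EINVAL;
	return 0;
}
```

Testing the pointer keeps the allocation decision in one place: downstream code cannot disagree with it about whether a buffer exists.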
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma for-next
Update mlx4 to support user MR creation against read-only memory, previously
it required the memory to be writable.
Based on rdma for-rc due to dependencies.
* mr_fix: (2 commits)
IB/mlx4: Mark user MR as writable if actual virtual memory is writable
IB/core: Make testing MR flags for writability a static inline function
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Given we are dealing with nanosecond level timers, when the timer
pops, ensure it happens on the CPU which caused the timer to be set
in the first place. This avoids excessive jitter from the desired
expiration time by avoiding the cost of switching our context to
another CPU that is cache cold for this given timer.
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
For errorinfo MAD requests, the response has a 0 port number left over
from a memset. Instead we should always set the port number in the
response.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Knowledge of the internal workings of expected receive
is too distributed.
Fix by:
- right-sizing several rcd fields associated with
expected receive
- adding an init entry point to initialize all the lists
- consolidating all the allocations into an array anchored
in the rcd
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add trace support for 16B Management Packets.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
16B Management Packets (L4=0x08) replace the BTH and DETH
of normal MAD packets with a header containing the
source and destination queue pair numbers; fields that
were originally retrieved from the BTH/DETH are now populated
from this header as well as from the 16B LRH (e.g. pkey).
16B Management Packets are used as an optimized management
format on 16B fabrics.
These management packets have an opcode of IB_OPCODE_UD_SEND_ONLY,
a fixed 3-byte pad, and a header length of 24 bytes.
The decision as to when we send a management packet is based
upon either the source or destination queue pair number being
0 or 1.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
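The send-side decision described above reduces to a check on the two queue pair numbers. A minimal sketch, with hypothetical function and macro names (the numeric values are the ones quoted in the commit text; nothing here is the driver's actual code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the commit message; the macro names are hypothetical. */
#define SKETCH_16B_L4_MGMT      0x08
#define SKETCH_16B_MGMT_PAD     3   /* bytes */
#define SKETCH_16B_MGMT_HDR_LEN 24  /* bytes */

/* QP0 and QP1 are the management queue pairs. */
bool is_mgmt_qp(uint32_t qpn)
{
	return qpn == 0 || qpn == 1;
}

/* Use the 16B management format when either end is QP0 or QP1. */
bool use_16b_mgmt_packet(uint32_t src_qpn, uint32_t dst_qpn)
{
	return is_mgmt_qp(src_qpn) || is_mgmt_qp(dst_qpn);
}
```

For example, a packet from QP 1 to QP 42 would use the management format, while one between QP 17 and QP 42 would not.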
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add 16B Management Packet definition. This optimized packet
format replaces the ib_other_headers and BTH with a source
and destination QP number.
To support these packets we introduce struct opa_16b_mgmt
into the struct hfi1_16b_header.
This packet format is only used for MAD packets using the
IB_OPCODE_UD_SEND_ONLY opcode on QP0/1.
The original 16B implementation failed to use 16B management
packets so now we add their definition.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
A recent patch set to rework the usage of debugfs and to add fault
injection capabilities via debugfs files to the hfi1 driver introduced a
build error that only shows up when debugfs is fully disabled. The
patchset mistakenly defines some empty stub functions in two different
headers when debugfs is disabled. Remove the set that shouldn't have
been there to resolve the issue.
Fixes: a74d5307caba ("IB/hfi1: Rework fault injection machinery")
Signed-off-by: Doug Ledford <dledford@redhat.com>
|