summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/sw/rdmavt
diff options
context:
space:
mode:
authorLinus Torvalds2017-05-03 21:45:55 +0200
committerLinus Torvalds2017-05-03 21:45:55 +0200
commit1684096b1ed813f621fb6cbd06e72235c1c2a0ca (patch)
tree13a228c35d6344f5d23b2c195aa3b026e42aac4b /drivers/infiniband/sw/rdmavt
parentMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dto... (diff)
parentinfiniband: avoid dereferencing uninitialized dst on error path (diff)
downloadkernel-qcow2-linux-1684096b1ed813f621fb6cbd06e72235c1c2a0ca.tar.gz
kernel-qcow2-linux-1684096b1ed813f621fb6cbd06e72235c1c2a0ca.tar.xz
kernel-qcow2-linux-1684096b1ed813f621fb6cbd06e72235c1c2a0ca.zip
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma updates from Doug Ledford: "More exchaustive description of primary updates in this release: - Lots of driver fixes and misc fixes across the board. - I had to base on a net-next tree because the IPoIB Accelorator patches needed it. Unfortunately, it was known to Mellanox that there would need to be an IPoIB accelorator patch to the net tree (which left some functions turned off by an #ifdef construct to avoid warnings about defined but unused functions), then one to the RDMA tree, then a fixup that went back and re-enabled the functions in the net tree and enabled their use in the rdma tree Also, a sparse fix was sent to the net tree after I did my pull, and the fixup patch conflicts quite directly with that sparse fix, so I'm going to submit the fixup patch towards the end of the merge window by itself and based upon your master branch at the time. - Two separate rounds of hfi1 fixes, one that got dropped from last release because it came in just a day or two before the end of the merge window and then the one from this release cycle. Of note is that I now have a third series that just landed from Intel yesterday. It is not included in this pull request, but I may submit it by the end of the week. I'll talk to Intel about improving the timing of thier submissions for my workflow. - Changes to our idr usage in the RDMA subsystem that will tie into our cgroup management and also into the upcoming changes for the RDMA kernel<->userspace API. - Addition of support for a netdev to be tied to an RDMA device at the core level - Addition of the VNIC driver from Intel. While IPoIB provides IP over InfiniBand (and *only* IP, no lower layer protocol headers are allowed or supported), the VNIC driver presents a virtual Ethernet device with support for things like varying Ethertypes, VLANs, priorities and other features of Ethernet. The virtual devices are centrally managed by the OPA fabric manager, making this (for the time being) a strictly OPA specific feature. - Improvements to the On-Demand Paging support in the RDMA subsystem. - Addition of three significant OPA changes. While we added OPA support some time ago (via the hfi1 driver), the RDMA subsystem has so far glossed over the areas where OPA and InfiniBand differ. With this release we are starting to add support for the OPA extensions into the RDMA core in the following area: Extended port information for OPA is now supported, extended Address Handle attributes for OPA are now supported, and extended SA Queries to get OPA specific subnet information is now supported. Concise summary from the tag: - idr usage and locking changes - build fix for hns - ipoib debug path record file fix - hfi1 updates - core RDMA netdev addition - Intel VNIC driver addition - Enhanced accelerators for IPoIB addition - Debug cleanups in cxgb3/4 - Trivial cleanups from SF Markus Elfring - Misc rxe fixes from Mellanox - Misc ipoib fixes from Mellanox - Lots of mlx4/mlx5 changes from Mellanox - Misc fixes across the RDMA subsystem - ODP paging fixes and improvements - qedr updates - hfi1 updates - OPA port info patches - OPA AH patches - OPA SA Query patches" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (191 commits) infiniband: avoid dereferencing uninitialized dst on error path IB/SA: Add OPA addr header IB/mlx5: Add port_xmit_wait to counter registers read IB/ocrdma: fix out of bounds access to local buffer IB/mlx4: Fix incorrect order of formal and actual parameters IB/mlx4: Change flush logic so it adheres to the variable name mlx5: Fix mlx5_ib_map_mr_sg mr length IB/rxe: Don't clamp residual length to mtu IB/SA: Add support to query OPA path records IB/SA: Add OPA path record type IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields IB/SA: Introduce path record specific types IB/SA: Rename ib_sa_path_rec to sa_path_rec IB/CM: Add braces when using sizeof IB/core: Define 'opa' rdma_ah_attr type IB/core: Define 'ib' and 'roce' rdma_ah_attr types IB/core: Use rdma_ah_attr accessor functions IB/core: Add accessor functions for rdma_ah_attr fields IB/PVRDMA: Rename ib_ah_attr related functions IB/mthca: Rename to_ib_ah_attr to to_rdma_ah_attr ...
Diffstat (limited to 'drivers/infiniband/sw/rdmavt')
-rw-r--r--drivers/infiniband/sw/rdmavt/ah.c39
-rw-r--r--drivers/infiniband/sw/rdmavt/ah.h6
-rw-r--r--drivers/infiniband/sw/rdmavt/cq.c3
-rw-r--r--drivers/infiniband/sw/rdmavt/mad.c2
-rw-r--r--drivers/infiniband/sw/rdmavt/mcast.c61
-rw-r--r--drivers/infiniband/sw/rdmavt/mr.c57
-rw-r--r--drivers/infiniband/sw/rdmavt/qp.c44
-rw-r--r--drivers/infiniband/sw/rdmavt/trace.h4
-rw-r--r--drivers/infiniband/sw/rdmavt/trace_cq.h127
-rw-r--r--drivers/infiniband/sw/rdmavt/trace_rc.h109
-rw-r--r--drivers/infiniband/sw/rdmavt/trace_tx.h34
11 files changed, 397 insertions, 89 deletions
diff --git a/drivers/infiniband/sw/rdmavt/ah.c b/drivers/infiniband/sw/rdmavt/ah.c
index 16c446142c2a..a96d4aa80ae8 100644
--- a/drivers/infiniband/sw/rdmavt/ah.c
+++ b/drivers/infiniband/sw/rdmavt/ah.c
@@ -60,32 +60,35 @@
* Return: 0 on success
*/
int rvt_check_ah(struct ib_device *ibdev,
- struct ib_ah_attr *ah_attr)
+ struct rdma_ah_attr *ah_attr)
{
int err;
+ int port_num = rdma_ah_get_port_num(ah_attr);
struct ib_port_attr port_attr;
struct rvt_dev_info *rdi = ib_to_rvt(ibdev);
- enum rdma_link_layer link = rdma_port_get_link_layer(ibdev,
- ah_attr->port_num);
+ enum rdma_link_layer link = rdma_port_get_link_layer(ibdev, port_num);
+ u32 dlid = rdma_ah_get_dlid(ah_attr);
+ u8 ah_flags = rdma_ah_get_ah_flags(ah_attr);
+ u8 static_rate = rdma_ah_get_static_rate(ah_attr);
- err = ib_query_port(ibdev, ah_attr->port_num, &port_attr);
+ err = ib_query_port(ibdev, port_num, &port_attr);
if (err)
return -EINVAL;
- if (ah_attr->port_num < 1 ||
- ah_attr->port_num > ibdev->phys_port_cnt)
+ if (port_num < 1 ||
+ port_num > ibdev->phys_port_cnt)
return -EINVAL;
- if (ah_attr->static_rate != IB_RATE_PORT_CURRENT &&
- ib_rate_to_mbps(ah_attr->static_rate) < 0)
+ if (static_rate != IB_RATE_PORT_CURRENT &&
+ ib_rate_to_mbps(static_rate) < 0)
return -EINVAL;
- if ((ah_attr->ah_flags & IB_AH_GRH) &&
- ah_attr->grh.sgid_index >= port_attr.gid_tbl_len)
+ if ((ah_flags & IB_AH_GRH) &&
+ rdma_ah_read_grh(ah_attr)->sgid_index >= port_attr.gid_tbl_len)
return -EINVAL;
if (link != IB_LINK_LAYER_ETHERNET) {
- if (ah_attr->dlid == 0)
+ if (dlid == 0)
return -EINVAL;
- if (ah_attr->dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE) &&
- ah_attr->dlid != be16_to_cpu(IB_LID_PERMISSIVE) &&
- !(ah_attr->ah_flags & IB_AH_GRH))
+ if (dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE) &&
+ dlid != be16_to_cpu(IB_LID_PERMISSIVE) &&
+ !(ah_flags & IB_AH_GRH))
return -EINVAL;
}
if (rdi->driver_f.check_ah)
@@ -104,7 +107,7 @@ EXPORT_SYMBOL(rvt_check_ah);
* Return: newly allocated ah
*/
struct ib_ah *rvt_create_ah(struct ib_pd *pd,
- struct ib_ah_attr *ah_attr)
+ struct rdma_ah_attr *ah_attr)
{
struct rvt_ah *ah;
struct rvt_dev_info *dev = ib_to_rvt(pd->device);
@@ -119,7 +122,7 @@ struct ib_ah *rvt_create_ah(struct ib_pd *pd,
spin_lock_irqsave(&dev->n_ahs_lock, flags);
if (dev->n_ahs_allocated == dev->dparms.props.max_ah) {
- spin_unlock(&dev->n_ahs_lock);
+ spin_unlock_irqrestore(&dev->n_ahs_lock, flags);
kfree(ah);
return ERR_PTR(-ENOMEM);
}
@@ -167,7 +170,7 @@ int rvt_destroy_ah(struct ib_ah *ibah)
*
* Return: 0 on success
*/
-int rvt_modify_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr)
+int rvt_modify_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
{
struct rvt_ah *ah = ibah_to_rvtah(ibah);
@@ -186,7 +189,7 @@ int rvt_modify_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr)
*
* Return: always 0
*/
-int rvt_query_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr)
+int rvt_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
{
struct rvt_ah *ah = ibah_to_rvtah(ibah);
diff --git a/drivers/infiniband/sw/rdmavt/ah.h b/drivers/infiniband/sw/rdmavt/ah.h
index e9c36be87d79..16105af99189 100644
--- a/drivers/infiniband/sw/rdmavt/ah.h
+++ b/drivers/infiniband/sw/rdmavt/ah.h
@@ -51,9 +51,9 @@
#include <rdma/rdma_vt.h>
struct ib_ah *rvt_create_ah(struct ib_pd *pd,
- struct ib_ah_attr *ah_attr);
+ struct rdma_ah_attr *ah_attr);
int rvt_destroy_ah(struct ib_ah *ibah);
-int rvt_modify_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr);
-int rvt_query_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr);
+int rvt_modify_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
+int rvt_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
#endif /* DEF_RVTAH_H */
diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 7aa7a4e312f1..0ae2ff8cf81e 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -50,6 +50,7 @@
#include <linux/kthread.h>
#include "cq.h"
#include "vt.h"
+#include "trace.h"
/**
* rvt_cq_enter - add a new entry to the completion queue
@@ -93,6 +94,7 @@ void rvt_cq_enter(struct rvt_cq *cq, struct ib_wc *entry, bool solicited)
}
return;
}
+ trace_rvt_cq_enter(cq, entry, head);
if (cq->ip) {
wc->uqueue[head].wr_id = entry->wr_id;
wc->uqueue[head].status = entry->status;
@@ -482,6 +484,7 @@ int rvt_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry)
if (tail == wc->head)
break;
/* The kernel doesn't need a RMB since it has the lock. */
+ trace_rvt_cq_poll(cq, &wc->kqueue[tail], npolled);
*entry = wc->kqueue[tail];
if (tail >= cq->ibcq.cqe)
tail = 0;
diff --git a/drivers/infiniband/sw/rdmavt/mad.c b/drivers/infiniband/sw/rdmavt/mad.c
index bba241faca61..d6981dc04adb 100644
--- a/drivers/infiniband/sw/rdmavt/mad.c
+++ b/drivers/infiniband/sw/rdmavt/mad.c
@@ -160,7 +160,7 @@ void rvt_free_mad_agents(struct rvt_dev_info *rdi)
ib_unregister_mad_agent(agent);
}
if (rvp->sm_ah) {
- ib_destroy_ah(&rvp->sm_ah->ibah);
+ rdma_destroy_ah(&rvp->sm_ah->ibah);
rvp->sm_ah = NULL;
}
diff --git a/drivers/infiniband/sw/rdmavt/mcast.c b/drivers/infiniband/sw/rdmavt/mcast.c
index 05c8c2afb0e3..1f12b69a0d07 100644
--- a/drivers/infiniband/sw/rdmavt/mcast.c
+++ b/drivers/infiniband/sw/rdmavt/mcast.c
@@ -100,10 +100,11 @@ static void rvt_mcast_qp_free(struct rvt_mcast_qp *mqp)
/**
* mcast_alloc - allocate the multicast GID structure
* @mgid: the multicast GID
+ * @lid: the muilticast LID (host order)
*
* A list of QPs will be attached to this structure.
*/
-static struct rvt_mcast *rvt_mcast_alloc(union ib_gid *mgid)
+static struct rvt_mcast *rvt_mcast_alloc(union ib_gid *mgid, u16 lid)
{
struct rvt_mcast *mcast;
@@ -111,7 +112,9 @@ static struct rvt_mcast *rvt_mcast_alloc(union ib_gid *mgid)
if (!mcast)
goto bail;
- mcast->mgid = *mgid;
+ mcast->mcast_addr.mgid = *mgid;
+ mcast->mcast_addr.lid = lid;
+
INIT_LIST_HEAD(&mcast->qp_list);
init_waitqueue_head(&mcast->wait);
atomic_set(&mcast->refcount, 0);
@@ -131,15 +134,19 @@ static void rvt_mcast_free(struct rvt_mcast *mcast)
}
/**
- * rvt_mcast_find - search the global table for the given multicast GID
+ * rvt_mcast_find - search the global table for the given multicast GID/LID
+ * NOTE: It is valid to have 1 MLID with multiple MGIDs. It is not valid
+ * to have 1 MGID with multiple MLIDs.
* @ibp: the IB port structure
* @mgid: the multicast GID to search for
+ * @lid: the multicast LID portion of the multicast address (host order)
*
* The caller is responsible for decrementing the reference count if found.
*
* Return: NULL if not found.
*/
-struct rvt_mcast *rvt_mcast_find(struct rvt_ibport *ibp, union ib_gid *mgid)
+struct rvt_mcast *rvt_mcast_find(struct rvt_ibport *ibp, union ib_gid *mgid,
+ u16 lid)
{
struct rb_node *n;
unsigned long flags;
@@ -153,15 +160,18 @@ struct rvt_mcast *rvt_mcast_find(struct rvt_ibport *ibp, union ib_gid *mgid)
mcast = rb_entry(n, struct rvt_mcast, rb_node);
- ret = memcmp(mgid->raw, mcast->mgid.raw,
- sizeof(union ib_gid));
+ ret = memcmp(mgid->raw, mcast->mcast_addr.mgid.raw,
+ sizeof(*mgid));
if (ret < 0) {
n = n->rb_left;
} else if (ret > 0) {
n = n->rb_right;
} else {
- atomic_inc(&mcast->refcount);
- found = mcast;
+ /* MGID/MLID must match */
+ if (mcast->mcast_addr.lid == lid) {
+ atomic_inc(&mcast->refcount);
+ found = mcast;
+ }
break;
}
}
@@ -177,7 +187,8 @@ EXPORT_SYMBOL(rvt_mcast_find);
*
* Return: zero if both were added. Return EEXIST if the GID was already in
* the table but the QP was added. Return ESRCH if the QP was already
- * attached and neither structure was added.
+ * attached and neither structure was added. Return EINVAL if the MGID was
+ * found, but the MLID did NOT match.
*/
static int rvt_mcast_add(struct rvt_dev_info *rdi, struct rvt_ibport *ibp,
struct rvt_mcast *mcast, struct rvt_mcast_qp *mqp)
@@ -195,8 +206,9 @@ static int rvt_mcast_add(struct rvt_dev_info *rdi, struct rvt_ibport *ibp,
pn = *n;
tmcast = rb_entry(pn, struct rvt_mcast, rb_node);
- ret = memcmp(mcast->mgid.raw, tmcast->mgid.raw,
- sizeof(union ib_gid));
+ ret = memcmp(mcast->mcast_addr.mgid.raw,
+ tmcast->mcast_addr.mgid.raw,
+ sizeof(mcast->mcast_addr.mgid));
if (ret < 0) {
n = &pn->rb_left;
continue;
@@ -206,6 +218,11 @@ static int rvt_mcast_add(struct rvt_dev_info *rdi, struct rvt_ibport *ibp,
continue;
}
+ if (tmcast->mcast_addr.lid != mcast->mcast_addr.lid) {
+ ret = EINVAL;
+ goto bail;
+ }
+
/* Search the QP list to see if this is already there. */
list_for_each_entry_rcu(p, &tmcast->qp_list, list) {
if (p->qp == mqp->qp) {
@@ -276,7 +293,7 @@ int rvt_attach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
* Allocate data structures since its better to do this outside of
* spin locks and it will most likely be needed.
*/
- mcast = rvt_mcast_alloc(gid);
+ mcast = rvt_mcast_alloc(gid, lid);
if (!mcast)
return -ENOMEM;
@@ -296,6 +313,10 @@ int rvt_attach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
/* Exceeded the maximum number of mcast groups. */
ret = -ENOMEM;
goto bail_mqp;
+ case EINVAL:
+ /* Invalid MGID/MLID pair */
+ ret = -EINVAL;
+ goto bail_mqp;
default:
break;
}
@@ -344,14 +365,20 @@ int rvt_detach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
}
mcast = rb_entry(n, struct rvt_mcast, rb_node);
- ret = memcmp(gid->raw, mcast->mgid.raw,
- sizeof(union ib_gid));
- if (ret < 0)
+ ret = memcmp(gid->raw, mcast->mcast_addr.mgid.raw,
+ sizeof(*gid));
+ if (ret < 0) {
n = n->rb_left;
- else if (ret > 0)
+ } else if (ret > 0) {
n = n->rb_right;
- else
+ } else {
+ /* MGID/MLID must match */
+ if (mcast->mcast_addr.lid != lid) {
+ spin_unlock_irq(&ibp->lock);
+ return -EINVAL;
+ }
break;
+ }
}
/* Search the QP list. */
diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index ae30b6838d79..aa5f9ea318e4 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -191,8 +191,9 @@ static int rvt_alloc_lkey(struct rvt_mregion *mr, int dma_region)
tmr = rcu_access_pointer(dev->dma_mr);
if (!tmr) {
- rcu_assign_pointer(dev->dma_mr, mr);
mr->lkey_published = 1;
+ /* Insure published written first */
+ rcu_assign_pointer(dev->dma_mr, mr);
rvt_get_mr(mr);
}
goto success;
@@ -224,8 +225,9 @@ static int rvt_alloc_lkey(struct rvt_mregion *mr, int dma_region)
mr->lkey |= 1 << 8;
rkt->gen++;
}
- rcu_assign_pointer(rkt->table[r], mr);
mr->lkey_published = 1;
+ /* Insure published written first */
+ rcu_assign_pointer(rkt->table[r], mr);
success:
spin_unlock_irqrestore(&rkt->lock, flags);
out:
@@ -253,23 +255,24 @@ static void rvt_free_lkey(struct rvt_mregion *mr)
spin_lock_irqsave(&rkt->lock, flags);
if (!lkey) {
if (mr->lkey_published) {
- RCU_INIT_POINTER(dev->dma_mr, NULL);
+ mr->lkey_published = 0;
+ /* insure published is written before pointer */
+ rcu_assign_pointer(dev->dma_mr, NULL);
rvt_put_mr(mr);
}
} else {
if (!mr->lkey_published)
goto out;
r = lkey >> (32 - dev->dparms.lkey_table_size);
- RCU_INIT_POINTER(rkt->table[r], NULL);
+ mr->lkey_published = 0;
+ /* insure published is written before pointer */
+ rcu_assign_pointer(rkt->table[r], NULL);
}
- mr->lkey_published = 0;
freed++;
out:
spin_unlock_irqrestore(&rkt->lock, flags);
- if (freed) {
- synchronize_rcu();
+ if (freed)
percpu_ref_kill(&mr->refcount);
- }
}
static struct rvt_mr *__rvt_alloc_mr(int count, struct ib_pd *pd)
@@ -405,8 +408,7 @@ struct ib_mr *rvt_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
mr->mr.access_flags = mr_access_flags;
mr->umem = umem;
- if (is_power_of_2(umem->page_size))
- mr->mr.page_shift = ilog2(umem->page_size);
+ mr->mr.page_shift = umem->page_shift;
m = 0;
n = 0;
for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
@@ -418,8 +420,9 @@ struct ib_mr *rvt_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
goto bail_inval;
}
mr->mr.map[m]->segs[n].vaddr = vaddr;
- mr->mr.map[m]->segs[n].length = umem->page_size;
- trace_rvt_mr_user_seg(&mr->mr, m, n, vaddr, umem->page_size);
+ mr->mr.map[m]->segs[n].length = BIT(umem->page_shift);
+ trace_rvt_mr_user_seg(&mr->mr, m, n, vaddr,
+ BIT(umem->page_shift));
n++;
if (n == RVT_SEGSZ) {
m++;
@@ -822,16 +825,21 @@ int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
goto ok;
}
mr = rcu_dereference(rkt->table[sge->lkey >> rkt->shift]);
- if (unlikely(!mr || atomic_read(&mr->lkey_invalid) ||
- mr->lkey != sge->lkey || mr->pd != &pd->ibpd))
+ if (!mr)
goto bail;
+ rvt_get_mr(mr);
+ if (!READ_ONCE(mr->lkey_published))
+ goto bail_unref;
+
+ if (unlikely(atomic_read(&mr->lkey_invalid) ||
+ mr->lkey != sge->lkey || mr->pd != &pd->ibpd))
+ goto bail_unref;
off = sge->addr - mr->user_base;
if (unlikely(sge->addr < mr->user_base ||
off + sge->length > mr->length ||
(mr->access_flags & acc) != acc))
- goto bail;
- rvt_get_mr(mr);
+ goto bail_unref;
rcu_read_unlock();
off += mr->offset;
@@ -867,6 +875,8 @@ int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
isge->n = n;
ok:
return 1;
+bail_unref:
+ rvt_put_mr(mr);
bail:
rcu_read_unlock();
return 0;
@@ -922,15 +932,20 @@ int rvt_rkey_ok(struct rvt_qp *qp, struct rvt_sge *sge,
}
mr = rcu_dereference(rkt->table[rkey >> rkt->shift]);
- if (unlikely(!mr || atomic_read(&mr->lkey_invalid) ||
- mr->lkey != rkey || qp->ibqp.pd != mr->pd))
+ if (!mr)
goto bail;
+ rvt_get_mr(mr);
+ /* insure mr read is before test */
+ if (!READ_ONCE(mr->lkey_published))
+ goto bail_unref;
+ if (unlikely(atomic_read(&mr->lkey_invalid) ||
+ mr->lkey != rkey || qp->ibqp.pd != mr->pd))
+ goto bail_unref;
off = vaddr - mr->iova;
if (unlikely(vaddr < mr->iova || off + len > mr->length ||
(mr->access_flags & acc) == 0))
- goto bail;
- rvt_get_mr(mr);
+ goto bail_unref;
rcu_read_unlock();
off += mr->offset;
@@ -966,6 +981,8 @@ int rvt_rkey_ok(struct rvt_qp *qp, struct rvt_sge *sge,
sge->n = n;
ok:
return 1;
+bail_unref:
+ rvt_put_mr(mr);
bail:
rcu_read_unlock();
return 0;
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index f5ad8d4bfb39..727e81cc2c8f 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016, 2017 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@@ -117,23 +117,6 @@ const int ib_rvt_state_ops[IB_QPS_ERR + 1] = {
};
EXPORT_SYMBOL(ib_rvt_state_ops);
-/*
- * Translate ib_wr_opcode into ib_wc_opcode.
- */
-const enum ib_wc_opcode ib_rvt_wc_opcode[] = {
- [IB_WR_RDMA_WRITE] = IB_WC_RDMA_WRITE,
- [IB_WR_RDMA_WRITE_WITH_IMM] = IB_WC_RDMA_WRITE,
- [IB_WR_SEND] = IB_WC_SEND,
- [IB_WR_SEND_WITH_IMM] = IB_WC_SEND,
- [IB_WR_RDMA_READ] = IB_WC_RDMA_READ,
- [IB_WR_ATOMIC_CMP_AND_SWP] = IB_WC_COMP_SWAP,
- [IB_WR_ATOMIC_FETCH_AND_ADD] = IB_WC_FETCH_ADD,
- [IB_WR_SEND_WITH_INV] = IB_WC_SEND,
- [IB_WR_LOCAL_INV] = IB_WC_LOCAL_INV,
- [IB_WR_REG_MR] = IB_WC_REG_MR
-};
-EXPORT_SYMBOL(ib_rvt_wc_opcode);
-
static void get_map_page(struct rvt_qpn_table *qpt,
struct rvt_qpn_map *map,
gfp_t gfp)
@@ -1121,14 +1104,15 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
goto inval;
if (attr_mask & IB_QP_AV) {
- if (attr->ah_attr.dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE))
+ if (rdma_ah_get_dlid(&attr->ah_attr) >=
+ be16_to_cpu(IB_MULTICAST_LID_BASE))
goto inval;
if (rvt_check_ah(qp->ibqp.device, &attr->ah_attr))
goto inval;
}
if (attr_mask & IB_QP_ALT_PATH) {
- if (attr->alt_ah_attr.dlid >=
+ if (rdma_ah_get_dlid(&attr->alt_ah_attr) >=
be16_to_cpu(IB_MULTICAST_LID_BASE))
goto inval;
if (rvt_check_ah(qp->ibqp.device, &attr->alt_ah_attr))
@@ -1257,7 +1241,7 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
if (attr_mask & IB_QP_AV) {
qp->remote_ah_attr = attr->ah_attr;
- qp->s_srate = attr->ah_attr.static_rate;
+ qp->s_srate = rdma_ah_get_static_rate(&attr->ah_attr);
qp->srate_mbps = ib_rate_to_mbps(qp->s_srate);
}
@@ -1270,7 +1254,7 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
qp->s_mig_state = attr->path_mig_state;
if (mig) {
qp->remote_ah_attr = qp->alt_ah_attr;
- qp->port_num = qp->alt_ah_attr.port_num;
+ qp->port_num = rdma_ah_get_port_num(&qp->alt_ah_attr);
qp->s_pkey_index = qp->s_alt_pkey_index;
}
}
@@ -1441,7 +1425,8 @@ int rvt_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
attr->timeout = qp->timeout;
attr->retry_cnt = qp->s_retry_cnt;
attr->rnr_retry = qp->s_rnr_retry_cnt;
- attr->alt_port_num = qp->alt_ah_attr.port_num;
+ attr->alt_port_num =
+ rdma_ah_get_port_num(&qp->alt_ah_attr);
attr->alt_timeout = qp->alt_timeout;
init_attr->event_handler = qp->ibqp.event_handler;
@@ -1789,11 +1774,14 @@ static int rvt_post_one_wr(struct rvt_qp *qp,
0);
qp->s_next_psn = wqe->lpsn + 1;
}
- trace_rvt_post_one_wr(qp, wqe);
- if (unlikely(reserved_op))
+ if (unlikely(reserved_op)) {
+ wqe->wr.send_flags |= RVT_SEND_RESERVE_USED;
rvt_qp_wqe_reserve(qp, wqe);
- else
+ } else {
+ wqe->wr.send_flags &= ~RVT_SEND_RESERVE_USED;
qp->s_avail--;
+ }
+ trace_rvt_post_one_wr(qp, wqe);
smp_wmb(); /* see request builders */
qp->s_head = next;
@@ -2069,8 +2057,12 @@ static void rvt_rc_timeout(unsigned long arg)
spin_lock_irqsave(&qp->r_lock, flags);
spin_lock(&qp->s_lock);
if (qp->s_flags & RVT_S_TIMER) {
+ struct rvt_ibport *rvp = rdi->ports[qp->port_num - 1];
+
qp->s_flags &= ~RVT_S_TIMER;
+ rvp->n_rc_timeouts++;
del_timer(&qp->s_timer);
+ trace_rvt_rc_timeout(qp, qp->s_last_psn + 1);
if (rdi->driver_f.notify_restart_rc)
rdi->driver_f.notify_restart_rc(qp,
qp->s_last_psn + 1,
diff --git a/drivers/infiniband/sw/rdmavt/trace.h b/drivers/infiniband/sw/rdmavt/trace.h
index e2d23acb6a7d..bb4b1e710f22 100644
--- a/drivers/infiniband/sw/rdmavt/trace.h
+++ b/drivers/infiniband/sw/rdmavt/trace.h
@@ -1,5 +1,5 @@
/*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016, 2017 Intel Corporation.
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
@@ -52,3 +52,5 @@
#include "trace_qp.h"
#include "trace_tx.h"
#include "trace_mr.h"
+#include "trace_cq.h"
+#include "trace_rc.h"
diff --git a/drivers/infiniband/sw/rdmavt/trace_cq.h b/drivers/infiniband/sw/rdmavt/trace_cq.h
new file mode 100644
index 000000000000..a315850aa9bb
--- /dev/null
+++ b/drivers/infiniband/sw/rdmavt/trace_cq.h
@@ -0,0 +1,127 @@
+/*
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license. When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * - Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * - Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#if !defined(__RVT_TRACE_CQ_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __RVT_TRACE_CQ_H
+
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#include <rdma/ib_verbs.h>
+#include <rdma/rdmavt_cq.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rvt_cq
+
+#define wc_opcode_name(opcode) { IB_WC_##opcode, #opcode }
+#define show_wc_opcode(opcode) \
+__print_symbolic(opcode, \
+ wc_opcode_name(SEND), \
+ wc_opcode_name(RDMA_WRITE), \
+ wc_opcode_name(RDMA_READ), \
+ wc_opcode_name(COMP_SWAP), \
+ wc_opcode_name(FETCH_ADD), \
+ wc_opcode_name(LSO), \
+ wc_opcode_name(LOCAL_INV), \
+ wc_opcode_name(REG_MR), \
+ wc_opcode_name(MASKED_COMP_SWAP), \
+ wc_opcode_name(RECV), \
+ wc_opcode_name(RECV_RDMA_WITH_IMM))
+
+#define CQ_PRN \
+"[%s] idx %u wr_id %llx status %u opcode %u,%s length %u qpn %x"
+
+DECLARE_EVENT_CLASS(
+ rvt_cq_entry_template,
+ TP_PROTO(struct rvt_cq *cq, struct ib_wc *wc, u32 idx),
+ TP_ARGS(cq, wc, idx),
+ TP_STRUCT__entry(
+ RDI_DEV_ENTRY(cq->rdi)
+ __field(u64, wr_id)
+ __field(u32, status)
+ __field(u32, opcode)
+ __field(u32, qpn)
+ __field(u32, length)
+ __field(u32, idx)
+ ),
+ TP_fast_assign(
+ RDI_DEV_ASSIGN(cq->rdi)
+ __entry->wr_id = wc->wr_id;
+ __entry->status = wc->status;
+ __entry->opcode = wc->opcode;
+ __entry->length = wc->byte_len;
+ __entry->qpn = wc->qp->qp_num;
+ __entry->idx = idx;
+ ),
+ TP_printk(
+ CQ_PRN,
+ __get_str(dev),
+ __entry->idx,
+ __entry->wr_id,
+ __entry->status,
+ __entry->opcode, show_wc_opcode(__entry->opcode),
+ __entry->length,
+ __entry->qpn
+ )
+);
+
+DEFINE_EVENT(
+ rvt_cq_entry_template, rvt_cq_enter,
+ TP_PROTO(struct rvt_cq *cq, struct ib_wc *wc, u32 idx),
+ TP_ARGS(cq, wc, idx));
+
+DEFINE_EVENT(
+ rvt_cq_entry_template, rvt_cq_poll,
+ TP_PROTO(struct rvt_cq *cq, struct ib_wc *wc, u32 idx),
+ TP_ARGS(cq, wc, idx));
+
+#endif /* __RVT_TRACE_CQ_H */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_cq
+#include <trace/define_trace.h>
diff --git a/drivers/infiniband/sw/rdmavt/trace_rc.h b/drivers/infiniband/sw/rdmavt/trace_rc.h
new file mode 100644
index 000000000000..995276933a55
--- /dev/null
+++ b/drivers/infiniband/sw/rdmavt/trace_rc.h
@@ -0,0 +1,109 @@
+/*
+ * Copyright(c) 2017 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license. When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * - Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * - Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#if !defined(__RVT_TRACE_RC_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __RVT_TRACE_RC_H
+
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#include <rdma/ib_verbs.h>
+#include <rdma/rdma_vt.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rvt_rc
+
+DECLARE_EVENT_CLASS(rvt_rc_template,
+ TP_PROTO(struct rvt_qp *qp, u32 psn),
+ TP_ARGS(qp, psn),
+ TP_STRUCT__entry(
+ RDI_DEV_ENTRY(ib_to_rvt(qp->ibqp.device))
+ __field(u32, qpn)
+ __field(u32, s_flags)
+ __field(u32, psn)
+ __field(u32, s_psn)
+ __field(u32, s_next_psn)
+ __field(u32, s_sending_psn)
+ __field(u32, s_sending_hpsn)
+ __field(u32, r_psn)
+ ),
+ TP_fast_assign(
+ RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
+ __entry->qpn = qp->ibqp.qp_num;
+ __entry->s_flags = qp->s_flags;
+ __entry->psn = psn;
+ __entry->s_psn = qp->s_psn;
+ __entry->s_next_psn = qp->s_next_psn;
+ __entry->s_sending_psn = qp->s_sending_psn;
+ __entry->s_sending_hpsn = qp->s_sending_hpsn;
+ __entry->r_psn = qp->r_psn;
+ ),
+ TP_printk(
+ "[%s] qpn 0x%x s_flags 0x%x psn 0x%x s_psn 0x%x s_next_psn 0x%x s_sending_psn 0x%x sending_hpsn 0x%x r_psn 0x%x",
+ __get_str(dev),
+ __entry->qpn,
+ __entry->s_flags,
+ __entry->psn,
+ __entry->s_psn,
+ __entry->s_next_psn,
+ __entry->s_sending_psn,
+ __entry->s_sending_hpsn,
+ __entry->r_psn
+ )
+);
+
+DEFINE_EVENT(rvt_rc_template, rvt_rc_timeout,
+ TP_PROTO(struct rvt_qp *qp, u32 psn),
+ TP_ARGS(qp, psn)
+);
+
+#endif /* __RVT_TRACE_RC_H */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_rc
+#include <trace/define_trace.h>
diff --git a/drivers/infiniband/sw/rdmavt/trace_tx.h b/drivers/infiniband/sw/rdmavt/trace_tx.h
index 0e03173662d8..a613a2223751 100644
--- a/drivers/infiniband/sw/rdmavt/trace_tx.h
+++ b/drivers/infiniband/sw/rdmavt/trace_tx.h
@@ -71,10 +71,20 @@ __print_symbolic(opcode, \
wr_opcode_name(RDMA_READ_WITH_INV), \
wr_opcode_name(LOCAL_INV), \
wr_opcode_name(MASKED_ATOMIC_CMP_AND_SWP), \
- wr_opcode_name(MASKED_ATOMIC_FETCH_AND_ADD))
+ wr_opcode_name(MASKED_ATOMIC_FETCH_AND_ADD), \
+ wr_opcode_name(RESERVED1), \
+ wr_opcode_name(RESERVED2), \
+ wr_opcode_name(RESERVED3), \
+ wr_opcode_name(RESERVED4), \
+ wr_opcode_name(RESERVED5), \
+ wr_opcode_name(RESERVED6), \
+ wr_opcode_name(RESERVED7), \
+ wr_opcode_name(RESERVED8), \
+ wr_opcode_name(RESERVED9), \
+ wr_opcode_name(RESERVED10))
#define POS_PRN \
-"[%s] wr_id %llx qpn %x psn 0x%x lpsn 0x%x length %u opcode 0x%.2x,%s size %u avail %u head %u last %u"
+"[%s] wqe %p wr_id %llx send_flags %x qpn %x qpt %u psn %x lpsn %x ssn %x length %u opcode 0x%.2x,%s size %u avail %u head %u last %u pid %u num_sge %u"
TRACE_EVENT(
rvt_post_one_wr,
@@ -83,7 +93,9 @@ TRACE_EVENT(
TP_STRUCT__entry(
RDI_DEV_ENTRY(ib_to_rvt(qp->ibqp.device))
__field(u64, wr_id)
+ __field(struct rvt_swqe *, wqe)
__field(u32, qpn)
+ __field(u32, qpt)
__field(u32, psn)
__field(u32, lpsn)
__field(u32, length)
@@ -92,11 +104,17 @@ TRACE_EVENT(
__field(u32, avail)
__field(u32, head)
__field(u32, last)
+ __field(u32, ssn)
+ __field(int, send_flags)
+ __field(pid_t, pid)
+ __field(int, num_sge)
),
TP_fast_assign(
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
+ __entry->wqe = wqe;
__entry->wr_id = wqe->wr.wr_id;
__entry->qpn = qp->ibqp.qp_num;
+ __entry->qpt = qp->ibqp.qp_type;
__entry->psn = wqe->psn;
__entry->lpsn = wqe->lpsn;
__entry->length = wqe->length;
@@ -105,20 +123,30 @@ TRACE_EVENT(
__entry->avail = qp->s_avail;
__entry->head = qp->s_head;
__entry->last = qp->s_last;
+ __entry->pid = qp->pid;
+ __entry->ssn = wqe->ssn;
+ __entry->send_flags = wqe->wr.send_flags;
+ __entry->num_sge = wqe->wr.num_sge;
),
TP_printk(
POS_PRN,
__get_str(dev),
+ __entry->wqe,
__entry->wr_id,
+ __entry->send_flags,
__entry->qpn,
+ __entry->qpt,
__entry->psn,
__entry->lpsn,
+ __entry->ssn,
__entry->length,
__entry->opcode, show_wr_opcode(__entry->opcode),
__entry->size,
__entry->avail,
__entry->head,
- __entry->last
+ __entry->last,
+ __entry->pid,
+ __entry->num_sge
)
);