summaryrefslogtreecommitdiffstats
path: root/include/rdma/ib_verbs.h
Commit message (Collapse)AuthorAgeFilesLines
* IB/core: Add OPA MAD core capability flagIra Weiny2015-06-121-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | Add OPA MAD support flags to the core capability immutable flags. In addition add the rdma_cap_opa_mad helper function for core functions to use to detect OPA MAD support. OPA MADs share a common header with IBTA MADs but with some differences for increased performance. Sharing a common header with IBTA MADs allows us to share most of the MAD processing code when dealing with OPA MADs in addition to supporting some IBTA MADs on OPA devices. OPA MADs differ in the following ways: 1) MADs are variable size up to 2K IBTA defined MADs remain fixed at 256 bytes 2) OPA SMPs must carry valid PKeys 3) OPA SMP packets are a different format The MAD stack will use this new functionality to determine if OPA MAD processing should occur on individual device ports. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: Add support for additional MAD info to/from driversIra Weiny2015-06-121-3/+6
| | | | | | | | | | | | | | | | | | | | In order to support alternate sized MADs (and variable sized MADs on OPA devices) add in/out MAD size parameters to the process_mad core call. In addition, add an out_mad_pkey_index to communicate the pkey index the driver wishes the MAD stack to use when sending OPA MAD responses. The out MAD size and the out MAD PKey index are required by the MAD stack to generate responses on OPA devices. Furthermore, the in and out MAD parameters are made generic by specifying them as ib_mad_hdr rather than ib_mad. Drivers are modified as needed and are protected by BUG_ON flags if the MAD sizes passed to them is incorrect. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Add ability for drivers to report an alternate MAD size.Ira Weiny2015-06-121-0/+18
| | | | | | | | | | | | Add max MAD size to the device immutable data set and have all drivers that support MADs report the current IB MAD size (IB_MGMT_MAD_SIZE) to the core. Verify MAD size data in both the MAD core and when reading the immutable data. OPA drivers will report alternate MAD sizes in subsequent patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Pass hardware specific data in query_deviceMatan Barak2015-06-121-1/+2
| | | | | | | | | | Vendors should be able to pass vendor specific data to/from user-space via query_device uverb. In order to do this, we need to pass the vendors' specific udata. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Add timestamp_mask and hca_core_clock to query_deviceMatan Barak2015-06-121-0/+2
| | | | | | | | | | | | | | | | | In order to expose timestamp we need to expose two new attributes in query_device to be used for CQ completion time-stamping: timestamp_mask - how many bits are valid in the timestamp, where timestamp values could be 64bits the most. hca_core_clock - timestamp is given in HW cycles, the frequency in KHZ units of the HCA, necessary in order to convert cycles to seconds. This is added both to ib_query_device and its respective uverbs counterpart. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Add CQ creation time-stamping flagMatan Barak2015-06-121-0/+4
| | | | | | | | | Add CQ creation flag which dictates that the created CQ will report completion time-stamp value in the WC. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Change ib_create_cq to use struct ib_cq_init_attrMatan Barak2015-06-121-4/+3Star
| | | | | | | | | | | | Currently, ib_create_cq uses cqe and comp_vecotr instead of the extendible ib_cq_init_attr struct. Earlier patches already changed the vendors to work with ib_cq_init_attr. This patch changes the consumers too. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Change provider's API of create_cq to be extendibleMatan Barak2015-06-121-2/+8
| | | | | | | | | | | | | | | Add a new ib_cq_init_attr structure which contains the previous cqe (minimum number of CQ entries) and comp_vector (completion vector) in addition to a new flags field. All vendors' create_cq callbacks are changed in order to work with the new API. This commit does not change any functionality. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2 Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Don't advertise SA in RoCE port capabilitiesMoni Shoua2015-06-111-1/+0Star
| | | | | | | | The Subnet Administrator (SA) is not a component of the RoCE spec. Therefore, it should not be a capability of a RoCE port. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core cleanup: Add const to args - agent_send_responseIra Weiny2015-06-021-4/+5
| | | | | | | | | | | | In order to support constant callers of agent_send_response we add const specifiers to the its pointer arguments. Adjust the call tree accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core cleanup: Add const on args - device->process_madIra Weiny2015-06-021-3/+3
| | | | | | | | | | | The process_mad device function declares some parameters as "in". Make those parameters const and adjust the call tree under process_mad in the various drivers accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core cleanup: Add const to RDMA helpersIra Weiny2015-06-021-12/+12
| | | | | | | | | | | | | | | | | | | | | | | The ib_device passed to the new RDMA helpers is constant. Declare the ib_device as const in the following functions. rdma_protocol_ib rdma_protocol_roce rdma_protocol_iwarp rdma_ib_or_roce rdma_cap_ib_mad rdma_cap_ib_smi rdma_cap_ib_cm rdma_cap_iw_cm rdma_cap_ib_sa rdma_cap_ib_mcast rdma_cap_af_ib rdma_cap_eth_ah Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
*-. Merge branches 'bart-srp', 'generic-errors', 'ira-cleanups' and 'mwang-v8' ↵Doug Ledford2015-05-201-2/+299
|\ \ | | | | | | | | | into k.o/for-4.2
| | * IB/core: Change rdma_protocol_iboe to roceIra Weiny2015-05-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | After discussion upstream, it was agreed to transition the usage of iboe in the kernel to roce. This keeps our terminology consistent with what was finalized in the IBTA Annex 16 and IBTA Annex 17 publications. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/core: Convert core to use bitfield for capsIra Weiny2015-05-201-16/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove query_protocol callback Use the new Core Capability bits for: rdma_protocol_* rdma_cap_ib_mad rdma_cap_ib_smi rdma_cap_ib_cm rdma_cap_iw_cm rdma_cap_ib_sa rdma_cap_ib_mcast rdma_cap_af_ib rdma_cap_eth_ah Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/core: Add per port immutable struct to ib_deviceIra Weiny2015-05-201-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As of commit 5eb620c81ce3 "IB/core: Add helpers for uncached GID and P_Key searches"; pkey_tbl_len and gid_tbl_len are immutable data which are stored in the ib_device. The per port core capability flags to be added later are also immutable data to be stored in the ib_device object. In preparation for this create a structure for per port immutable data and place the pkey and gid table lengths within this structure. "get_port_immutable" is added as a mandatory device function to allow the drivers to fill in this data. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/core: Create common start/end port functionsIra Weiny2015-05-181-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously start_port and end_port were defined in 2 places, cache.c and device.c and this prevented their use in other modules. Make these common functions, change the name to reflect the rdma name space, and update existing users. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Improve docs for rdma-helpersMichael Wang2015-05-181-40/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Increase the level of documentation for the rdma_cap_* helpers introduced by Michael Wang <yun.wang@profitbricks.com>. This patch is loosely based on a patch Michael wrote to enhance the documentation of these functions, but has been significantly modified in terms of verbiage. In addition, the comments were moved from a kernel Documentation/infiniband/ file to being inline in the header file itself for the functions in question. Finally, the documentation was formated in proper kdoc format. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_eth_ah()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_eth_ah() to help us check if the port of an IB device support Ethernet Address Handler. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_af_ib()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_af_ib() to help us check if the port of an IB device support Native Infiniband Address. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_read_multi_sge()Michael Wang2015-05-181-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_read_multi_sge() to help us check if the port of an IB device support RDMA Read Multiple Scatter-Gather Entries. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_ib_mcast()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_ib_mcast() to help us check if the port of an IB device support Infiniband Multicast. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_ib_sa()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_ib_sa() to help us check if the port of an IB device support Infiniband Subnet Administration. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_iw_cm()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_iw_cm() to help us check if the port of an IB device support IWARP Communication Manager. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_ib_cm()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_ib_cm() to help us check if the port of an IB device support Infiniband Communication Manager. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_ib_smi()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_ib_smi() to help us check if the port of an IB device support Infiniband Subnet Management Interface. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Use management helper rdma_cap_ib_mad()Michael Wang2015-05-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce helper rdma_cap_ib_mad() to help us check if the port of an IB device support Infiniband Management Datagrams. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Implement raw management helpersMichael Wang2015-05-181-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add raw helpers: rdma_protocol_ib rdma_protocol_iboe rdma_protocol_iwarp rdma_ib_or_iboe (transition, clean up later) To help us detect which technology the port supported. Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/Verbs: Implement new callback query_protocol()Michael Wang2015-05-181-0/+9
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new callback query_protocol() and implement for each HW. Mapping List: node-type link-layer transport protocol nes RNIC ETH IWARP IWARP amso1100 RNIC ETH IWARP IWARP cxgb3 RNIC ETH IWARP IWARP cxgb4 RNIC ETH IWARP IWARP usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP ocrdma IB_CA ETH IB IBOE mlx4 IB_CA IB/ETH IB IB/IBOE mlx5 IB_CA IB IB IB ehca IB_CA IB IB IB ipath IB_CA IB IB IB mthca IB_CA IB IB IB qib IB_CA IB IB IB Signed-off-by: Michael Wang <yun.wang@profitbricks.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * IB/core, cma: Nice log-friendly string helpersSagi Grimberg2015-05-181-0/+4
|/ | | | | | | | | | Some of us keep revisiting the code to decode enumerations that appear in out logs. Let's borrow the nice logging helpers that exists in xprtrdma and rds for CMA events, IB events and WC statuses. Reviewd-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* Revert "IB/core: Add support for extended query device caps"Yann Droneaud2015-02-061-4/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While commit 7e36ef8205ff ("IB/core: Temporarily disable ex_query_device uverb") is correct as it makes the extended QUERY_DEVICE uverb (which came as part of commit 5a77abf9a97a ("IB/core: Add support for extended query device caps") and commit 860f10a799c8 ("IB/core: Add flags for on demand paging support")) not available to userspace, it doesn't address the initial issue regarding ib_copy_to_udata() [1][2]. Additionally, further discussions around this new uverb seems to conclude it would require a different data structure than the one currently described in <rdma/ib_user_verbs.h> [3]. Both of these issues require a revert of the changes, so this patch partially reverts commit 8cdd312cfed7 ("IB/mlx5: Implement the ODP capability query verb") and commit 860f10a799c8 ("IB/core: Add flags for on demand paging support") and fully reverts commit 5a77abf9a97a ("IB/core: Add support for extended query device caps"). [1] "Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps" http://mid.gmane.org/1418733236.2779.26.camel@opteya.com [2] "Re: [PATCH] IB/core: Temporarily disable ex_query_device uverb" http://mid.gmane.org/1423067503.3030.83.camel@opteya.com [3] "RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask" http://mid.gmane.org/2807E5FD2F6FDA4886F6618EAC48510E0CC12C30@CRSMSX101.amr.corp.intel.com Cc: Eli Cohen <eli@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/core: Implement support for MMU notifiers regarding on demand paging regionsHaggai Eran2014-12-161-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add an interval tree implementation for ODP umems. Create an interval tree for each ucontext (including a count of the number of ODP MRs in this context, semaphore, etc.), and register ODP umems in the interval tree. * Add MMU notifiers handling functions, using the interval tree to notify only the relevant umems and underlying MRs. * Register to receive MMU notifier events from the MM subsystem upon ODP MR registration (and unregister accordingly). * Add a completion object to synchronize the destruction of ODP umems. * Add mechanism to abort page faults when there's a concurrent invalidation. The way we synchronize between concurrent invalidations and page faults is by keeping a counter of currently running invalidations, and a sequence number that is incremented whenever an invalidation is caught. The page fault code checks the counter and also verifies that the sequence number hasn't progressed before it updates the umem's page tables. This is similar to what the kvm module does. In order to prevent the case where we register a umem in the middle of an ongoing notifier, we also keep a per ucontext counter of the total number of active mmu notifiers. We only enable new umems when all the running notifiers complete. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yuval Dagan <yuvalda@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/core: Add support for on demand paging regionsShachar Raindel2014-12-161-0/+2
| | | | | | | | | | | | | | | | | | | | | * Extend the umem struct to keep the ODP related data. * Allocate and initialize the ODP related information in the umem (page_list, dma_list) and freeing as needed in the end of the run. * Store a reference to the process PID struct in the ucontext. Used to safely obtain the task_struct and the mm during fault handling, without preventing the task destruction if needed. * Add 2 helper functions: ib_umem_odp_map_dma_pages and ib_umem_odp_unmap_dma_pages. These functions get the DMA addresses of specific pages of the umem (and, currently, pin them). * Support for page faults only - IB core will keep the reference on the pages used and call put_page when freeing an ODP umem area. Invalidations support will be added in a later patch. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/core: Add flags for on demand paging supportSagi Grimberg2014-12-161-2/+26
| | | | | | | | | | | | | | | | | | | | * Add a configuration option for enable on-demand paging support in the infiniband subsystem (CONFIG_INFINIBAND_ON_DEMAND_PAGING). In a later patch, this configuration option will select the MMU_NOTIFIER configuration option to enable mmu notifiers. * Add a flag for on demand paging (ODP) support in the IB device capabilities. * Add a flag to request ODP MR in the access flags to reg_mr. * Fail registrations done with the ODP flag when the low-level driver doesn't support this. * Change the conditions in which an MR will be writable to explicitly specify the access flags. This is to avoid making an MR writable just because it is an ODP MR. * Add a ODP capabilities to the extended query device verb. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/core: Add support for extended query device capsEli Cohen2014-12-161-1/+4
| | | | | | | | | | | Add extensible query device capabilities verb to allow adding new features. ib_uverbs_ex_query_device is added and copy_query_dev_fields is used to copy capability fields to be used by both ib_uverbs_query_device and ib_uverbs_ex_query_device. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/mlx5, iser, isert: Add Signature API additionsSagi Grimberg2014-10-091-18/+14Star
| | | | | | | | | | | | | | | | | | | | | | | | Expose more signature setting parameters. We modify the signature API to allow usage of some new execution parameters relevant to data integrity feature. This patch modifies ib_sig_domain structure by: - Deprecate DIF type in signature API (operation will be determined by the parameters alone, no DIF type awareness) - Add APPTAG check bitmask (for input domain) - Add REFTAG remap (increment) flag for each domain - Add APPTAG/REFTAG escape options for each domain The mlx5 driver is modified to follow the new parameters in HW signature setup. At the moment the callers (iser/isert) hard-code new parameters (by DIF type). In the future, callers will retrieve them from the scsi command structure. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/core: Add user MR re-registration supportMatan Barak2014-08-021-1/+9
| | | | | | | | | | | | Memory re-registration is a feature that enables changing the attributes of a memory region registered by user-space, including PD, translation (address and length) and access flags. Add the required support in uverbs and the kernel verbs API. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
*-. Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', ↵Roland Dreier2014-06-101-5/+6
|\ \ | | | | | | | | | 'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next
| | * IB: Add a QP creation flag to use GFP_NOIO allocationsOr Gerlitz2014-06-021-0/+1
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses a problem where NFS client writes over IPoIB connected mode may deadlock on memory allocation/writeback. The problem is not directly memory reclamation. There is an indirect dependency between network filesystems writing back pages and ipoib_cm_tx_init() due to how a kworker is used. Page reclaim cannot make forward progress until ipoib_cm_tx_init() succeeds and it is stuck in page reclaim itself waiting for network transmission. Ordinarily this situation may be avoided by having the caller use GFP_NOFS but ipoib_cm_tx_init() does not have that information. To address this, take a general approach and add a new QP creation flag that tells the low-level hardware driver to use GFP_NOIO for the memory allocations related to the new QP. Use the new flag in the ipoib connected mode path, and if the driver doesn't support it, re-issue the QP creation without the flag. Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * IB/core: Fix sparse warnings about redeclared functionsRoland Dreier2014-06-041-5/+5
|/ | | | | | | | | | | | | Fix a few functions that are declared with __attribute_const__ in the ib_verbs.h header file but defined without it in verbs.c. This gets rid of the following sparse warnings: drivers/infiniband/core/verbs.c:51:5: error: symbol 'ib_rate_to_mult' redeclared with different type (originally declared at include/rdma/ib_verbs.h:469) - different modifiers drivers/infiniband/core/verbs.c:68:14: error: symbol 'mult_to_ib_rate' redeclared with different type (originally declared at include/rdma/ib_verbs.h:607) - different modifiers drivers/infiniband/core/verbs.c:85:5: error: symbol 'ib_rate_to_mbps' redeclared with different type (originally declared at include/rdma/ib_verbs.h:476) - different modifiers drivers/infiniband/core/verbs.c:111:1: error: symbol 'rdma_node_get_transport' redeclared with different type (originally declared at include/rdma/ib_verbs.h:84) - different modifiers Signed-off-by: Roland Dreier <roland@purestorage.com>
*-. Merge branches 'core', 'cxgb4', 'ip-roce', 'iser', 'misc', 'mlx4', 'nes', ↵Roland Dreier2014-04-031-8/+6Star
|\ \ | | | | | | | | | 'ocrdma', 'qib', 'sgwrapper', 'srp' and 'usnic' into for-next
| | * IB/core: Remove overload in ib_sg_dma*Mike Marciniszyn2014-04-011-8/+6Star
| |/ | | | | | | | | | | | | | | | | | | | | The code is replaced by driver specific changes and avoids the pointer NULL test for drivers that don't overload these operations. Suggested-by: <Bart Van Assche <bvanassche@acm.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Vinod Kumar <vinod.kumar@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | IB/core: Introduce signature verbs APISagi Grimberg2014-03-071-1/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a verbs interface for signature-related operations. A signature handover operation configures the layouts of data and protection attributes both in memory and wire domains. Signature operations are: - INSERT: Generate and insert protection information when handing over data from input space to output space. - validate and STRIP: Validate protection information and remove it when handing over data from input space to output space. - validate and PASS: Validate protection information and pass it when handing over data from input space to output space. Once the signature handover opration is done, the HCA will offload data integrity generation/validation while performing the actual data transfer. Additions: 1. HCA signature capabilities in device attributes Verbs provider supporting signature handover operations fills relevant fields in device attributes structure returned by ib_query_device. 2. QP creation flag IB_QP_CREATE_SIGNATURE_EN Creating a QP that will carry signature handover operations may require some special preparations from the verbs provider. So we add QP creation flag IB_QP_CREATE_SIGNATURE_EN to declare that the created QP may carry out signature handover operations. Expose signature support to verbs layer (no support for now). 3. New send work request IB_WR_REG_SIG_MR Signature handover work request. This WR will define the signature handover properties of the memory/wire domains as well as the domains layout. The purpose of this work request is to bind all the needed information for the signature operation: - data to be transferred: wr->sg_list (ib_sge). * The raw data, pre-registered to a single MR (normally, before signature, this MR would have been used directly for the data transfer) - data protection guards: sig_handover.prot (ib_sge). * The data protection buffer, pre-registered to a single MR, which contains the data integrity guards of the raw data blocks. Note that it may not always exist, only in cases where the user is interested in storing protection guards in memory. - signature operation attributes: sig_handover.sig_attrs. * Tells the HCA how to validate/generate the protection information. Once the work request is executed, the memory region that will describe the signature transaction will be the sig_mr. The application can now go ahead and send the sig_mr.rkey or use the sig_mr.lkey for data transfer. 4. New Verb ib_check_mr_status check_mr_status verb checks the status of the memory region post transaction. The first check that may be used is IB_MR_CHECK_SIG_STATUS, which will indicate if any signature errors are pending for a specific signature-enabled ib_mr. This verb is a lightwight check and is allowed to be taken from interrupt context. An application must call this verb after it is known that the actual data transfer has finished. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | IB/core: Introduce protected memory regionsSagi Grimberg2014-03-071-0/+38
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | This commit introduces verbs for creating/destoying memory regions which will allow new types of memory key operations such as protected memory registration. Indirect memory registration is registering several (one of more) pre-registered memory regions in a specific layout. The Indirect region may potentialy describe several regions and some repitition format between them. Protected Memory registration is registering a memory region with various data integrity attributes which will describe protection schemes that will be handled by the HCA in an offloaded manner. These memory regions will be applicable for a new REG_SIG_MR work request introduced later in this patchset. In the future these routines may replace or implement current memory regions creation routines existing today: - ib_reg_user_mr - ib_alloc_fast_reg_mr - ib_get_dma_mr - ib_dereg_mr Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB: Report using RoCE IP based gids in port capsMoni Shoua2014-02-131-1/+2
| | | | | | | | | | | | For userspace RoCE UD QPs we need to know the GID format that the kernel uses, e.g when working over older kernels. For that end, add a new port capability IB_PORT_IP_BASED_GIDS and report it when query port is issued. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* Merge branch 'ip-roce' into for-nextRoland Dreier2014-01-231-2/+19
|\ | | | | | | | | Conflicts: drivers/infiniband/hw/mlx4/main.c
| * IB/core: Ethernet L2 attributes in verbs/cm structuresMatan Barak2014-01-141-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch add the support for Ethernet L2 attributes in the verbs/cm/cma structures. When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority in a similar manner that the IB L2 (and the L4 PKEY) attributes are used. Thus, those attributes were added to the following structures: * ib_ah_attr - added dmac * ib_qp_attr - added smac and vlan_id, (sl remains vlan priority) * ib_wc - added smac, vlan_id * ib_sa_path_rec - added smac, dmac, vlan_id * cm_av - added smac and vlan_id For the path record structure, extra care was taken to avoid the new fields when packing it into wire format, so we don't break the IB CM and SA wire protocol. On the active side, the CM fills. its internal structures from the path provided by the ULP. We add there taking the ETH L2 attributes and placing them into the CM Address Handle (struct cm_av). On the passive side, the CM fills its internal structures from the WC associated with the REQ message. We add there taking the ETH L2 attributes from the WC. When the HW driver provides the required ETH L2 attributes in the WC, they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core code checks for the presence of these flags, and in their absence does address resolution from the ib_init_ah_from_wc() helper function. ib_modify_qp_is_ok is also updated to consider the link layer. Some parameters are mandatory for Ethernet link layer, while they are irrelevant for IB. Vendor drivers are modified to support the new function signature. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| |
| \
*-. \ Merge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', ↵Roland Dreier2014-01-231-2/+19
|\ \ \ | |_|/ |/| | | | | 'ocrdma', 'qib', 'srp' and 'usnic' into for-next
| | * IB/core: Add support for RDMA_NODE_USNIC_UDPUpinder Malhi2014-01-181-0/+1
| | | | | | | | | | | | | | | | | | | | | Add the complementary RDMA_NODE_USNIC_UDP for RDMA_TRANSPORT_USNIC_UDP. Signed-off-by: Upinder Malhi <umalhi@cisco.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | * IB/core: Add RDMA_TRANSPORT_USNIC_UDPUpinder Malhi2014-01-141-1/+2
| |/ |/| | | | | | | | | | | Add RDMA_TRANSPORT_USNIC_UDP which will be used by usNIC. Signed-off-by: Upinder Malhi <umalhi@cisco.com> Signed-off-by: Roland Dreier <roland@purestorage.com>