path: root/src/net/infiniband
Commit message | Author | Age | Files | Lines
* [block] Describe all SAN devices via ACPI tables | Michael Brown | 2017-03-28 | 1 | -41/+100
  Describe all SAN devices via ACPI tables such as the iBFT. For tables that can describe only a single device (i.e. the aBFT and sBFT), one table is installed per device. For multi-device tables (i.e. the iBFT), all devices are described in a single table.

  An underlying SAN device connection may be closed at the time that we need to construct an ACPI table. We therefore introduce the concept of an "ACPI descriptor" which enables the SAN boot code to maintain an opaque pointer to the underlying object, and an "ACPI model" which can build tables from a list of such descriptors. This separates the lifecycles of ACPI descriptions from the lifecycles of the block device interfaces, and allows for construction of the ACPI tables even if the block device interface has been closed.

  For a multipath SAN device, iPXE will wait until sufficient information is available to describe all devices but will not wait for all paths to connect successfully. For example: with a multipath iSCSI boot iPXE will wait until at least one path has become available and name resolution has completed on all other paths. We do this since the iBFT has to include IP addresses rather than DNS names. We will commence booting without waiting for the inactive paths to either become available or close; this avoids unnecessary boot delays.

  Note that the Linux kernel will refuse to accept an iBFT with more than two NIC or target structures. We therefore describe only the NICs that are actually required in order to reach the described targets. Any iBFT with at most two targets is therefore guaranteed to describe at most two NICs.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Return status code from ib_create_mi() | Michael Brown | 2017-03-22 | 1 | -6/+10
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Return status code from ib_create_cq() and ib_create_qp() | Michael Brown | 2017-03-22 | 2 | -21/+19
  Any underlying errors arising during ib_create_cq() or ib_create_qp() are lost since the functions simply return NULL on error. This makes debugging harder, since a debug-enabled build is required to discover the root cause of the error.

  Fix by returning a status code from these functions, thereby allowing any underlying errors to be propagated.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
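The change of calling convention described in this commit can be sketched as follows. This is a minimal illustration and not the iPXE code: `struct demo_cq`, `demo_create_cq_old()` and `demo_create_cq()` are hypothetical names, and the specific error codes are assumptions chosen for the example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a completion queue; not iPXE's real structure */
struct demo_cq {
	unsigned int num_cqes;
};

/* Old style: every failure collapses into NULL, so the cause is lost */
struct demo_cq * demo_create_cq_old ( unsigned int num_cqes ) {
	struct demo_cq *cq;

	if ( num_cqes == 0 )
		return NULL; /* invalid argument */
	cq = calloc ( 1, sizeof ( *cq ) );
	if ( ! cq )
		return NULL; /* out of memory; caller cannot tell the difference */
	cq->num_cqes = num_cqes;
	return cq;
}

/* New style: return a status code and hand back the object via a pointer */
int demo_create_cq ( unsigned int num_cqes, struct demo_cq **new_cq ) {
	struct demo_cq *cq;

	if ( num_cqes == 0 )
		return -EINVAL;
	cq = calloc ( 1, sizeof ( *cq ) );
	if ( ! cq )
		return -ENOMEM;
	cq->num_cqes = num_cqes;
	*new_cq = cq;
	return 0;
}
```

With the second form, a caller can propagate the returned value unchanged, so the root cause is visible without a debug-enabled build.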
* [netdevice] Limit MTU by hardware maximum frame length | Michael Brown | 2017-01-25 | 1 | -0/+1
  Separate out the concept of "hardware maximum supported frame length" and "configured link MTU", and limit the latter according to the former.

  In networks where the DHCP-supplied link MTU is inconsistent with the hardware or driver capabilities (e.g. a network using jumbo frames), this will result in iPXE advertising a TCP MSS consistent with a size that can actually be received.

  Note that the term "MTU" is typically used to refer to the maximum length excluding the link-layer headers; we adopt this usage.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
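The relationship between the two limits can be sketched as below. The structure, field names and helper functions are hypothetical, and the MSS calculation assumes 20-byte IPv4 and TCP headers with no options.

```c
#include <stddef.h>

/* Hypothetical device description; field names are illustrative only */
struct demo_netdev {
	size_t max_pkt_len;   /* hardware maximum frame length, including link-layer header */
	size_t ll_header_len; /* link-layer header length */
	size_t mtu;           /* configured link MTU, excluding link-layer header */
};

/* Apply a requested MTU (e.g. from DHCP), limited by what the hardware supports */
size_t demo_set_mtu ( struct demo_netdev *netdev, size_t requested ) {
	size_t hw_max = ( netdev->max_pkt_len - netdev->ll_header_len );

	netdev->mtu = ( ( requested > hw_max ) ? hw_max : requested );
	return netdev->mtu;
}

/* TCP MSS derived from the MTU (assumes IPv4 and TCP headers of 20 bytes each) */
size_t demo_tcp_mss ( const struct demo_netdev *netdev ) {
	return ( netdev->mtu - 20 - 20 );
}
```

For a device with a 1514-byte maximum frame and a 14-byte link-layer header, a DHCP-supplied MTU of 9000 is limited to 1500, and the advertised MSS follows from the limited value.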
* [xsigo] Add support for Xsigo virtual Ethernet (XVE) EoIB devices | Michael Brown | 2016-03-09 | 1 | -0/+1858
  Add support for EoIB devices as implemented by Xsigo. Based on the public (but out-of-tree) Linux kernel drivers at

    https://oss.oracle.com/git/?p=linux-uek.git;a=log;h=v4.1.12-32.2.1

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Retrieve GID flag from cached path entries | Michael Brown | 2016-03-08 | 1 | -0/+1
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Assign names to queue pairs | Michael Brown | 2016-03-08 | 2 | -3/+5
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Assign names to CMRC connections | Michael Brown | 2016-03-08 | 2 | -36/+50
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Allow for the creation of multicast groups | Michael Brown | 2016-03-08 | 1 | -9/+16
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Parse MLID, rate, and SL from multicast membership record | Michael Brown | 2016-03-08 | 1 | -27/+30
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Record multicast GID attachment as part of group membership | Michael Brown | 2016-03-08 | 1 | -1/+9
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Do not use GRH for local paths | Michael Brown | 2016-03-08 | 1 | -7/+6
  Avoid including an unnecessary GRH in packets sent to unicast destinations within the local subnet.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Use correct transaction identifier in CM responses | Michael Brown | 2016-03-08 | 2 | -16/+24
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Use connection's local ID as debug message identifier | Michael Brown | 2016-03-08 | 2 | -25/+33
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Use "%d" as format specifier for LIDs | Michael Brown | 2016-03-08 | 2 | -4/+4
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Use "%#lx" as format specifier for queue pair numbers | Michael Brown | 2016-03-08 | 4 | -12/+12
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Assign names to Infiniband devices for debug messages | Michael Brown | 2016-03-08 | 6 | -63/+64
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Add support for performing service record lookups | Michael Brown | 2016-03-08 | 1 | -0/+67
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Avoid multiple calls to ib_cmrc_shutdown() | Michael Brown | 2016-03-08 | 1 | -0/+3
  When a CMRC connection is closed, the deferred shutdown process calls ib_destroy_qp(). This will cause the receive work queue entries to complete in error (since they are being cancelled), which will in turn reschedule the deferred shutdown process. This eventually leads to ib_destroy_conn() being called on a connection that has already been freed.

  Fix by explicitly cancelling any pending shutdown process after the shutdown process has completed.

  Ironically, this almost exactly reverts commit 019d4c1 ("[infiniband] Use a one-shot process for CMRC shutdown"); prior to the introduction of one-shot processes the only way to achieve a one-shot process was for the process to cancel itself.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
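The rescheduling loop described in this commit can be modelled with a small simulation. Everything here is hypothetical (`struct demo_conn`, the `demo_*` functions, and the bound of four iterations); the model only counts how often the shutdown body runs, whereas in the real code the second run would touch a connection that has already been freed.

```c
#include <stdbool.h>

/* Hypothetical connection state; not iPXE's real CMRC structure */
struct demo_conn {
	bool shutdown_pending;  /* deferred shutdown process is scheduled */
	unsigned int shutdowns; /* number of times the shutdown body has run */
};

/* Completion handler: a receive completing in error schedules a shutdown */
static void demo_complete_recv_error ( struct demo_conn *conn ) {
	conn->shutdown_pending = true;
}

/* Destroying the queue pair cancels posted receives, which complete in error */
static void demo_destroy_qp ( struct demo_conn *conn ) {
	demo_complete_recv_error ( conn );
}

/* Body of the deferred shutdown process */
static void demo_shutdown ( struct demo_conn *conn, bool cancel_after ) {
	conn->shutdown_pending = false; /* one-shot: descheduled as it starts */
	conn->shutdowns++;
	demo_destroy_qp ( conn );       /* reschedules the process as a side effect */
	if ( cancel_after )
		conn->shutdown_pending = false; /* the fix: cancel any pending run */
}

/* Run scheduled processes (at most four times); return number of shutdown runs */
unsigned int demo_run ( bool cancel_after ) {
	struct demo_conn conn = { true, 0 };
	unsigned int i;

	for ( i = 0 ; ( i < 4 ) && conn.shutdown_pending ; i++ )
		demo_shutdown ( &conn, cancel_after );
	return conn.shutdowns;
}
```

Without the cancellation the shutdown body keeps being rescheduled until the loop bound is hit; with it, the body runs exactly once.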
* [ipoib] Fix a race when chain-loading undionly.kpxe in IPoIB | Wissam Shoukair | 2015-08-17 | 1 | -0/+6
  The Infiniband link status change callback ipoib_link_state_changed() may be called while the IPoIB device is closed, in which case there will not be an IPoIB queue pair to be joined to the IPv4 broadcast group. This leads to NULL pointer dereferences in ib_mcast_attach() and ib_mcast_detach().

  Fix by not attempting to join (or leave) the broadcast group unless we actually have an IPoIB queue pair.

  Signed-off-by: Wissam Shoukair <wissams@mellanox.com>
  Modified-by: Michael Brown <mcb30@ipxe.org>
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [base16] Add buffer size parameter to base16_encode() and base16_decode() | Michael Brown | 2015-04-24 | 1 | -1/+1
  The current API for Base16 (and Base64) encoding requires the caller to always provide sufficient buffer space. This prevents the use of the generic encoding/decoding functionality in some situations, such as in formatting the hex setting types.

  Implement a generic hex_encode() (based on the existing format_hex_setting()), implement base16_encode() and base16_decode() in terms of the more generic hex_encode() and hex_decode(), and update all callers to provide the additional buffer length parameter.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
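A length-limited encoder of the kind this commit describes might look like the sketch below. `demo_hex_encode()` is a hypothetical function, not the iPXE hex_encode() (whose exact signature is not shown in this log); the snprintf()-style convention of returning the full encoded length even when the output is truncated is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode raw bytes as lower-case hex into buf (of size len), always
 * NUL-terminating when len is non-zero.  Returns the length the full
 * encoding needs, excluding the NUL, even if the output was truncated.
 */
size_t demo_hex_encode ( const void *raw, size_t raw_len,
			 char *buf, size_t len ) {
	static const char digits[] = "0123456789abcdef";
	const uint8_t *bytes = raw;
	size_t used = 0;
	size_t i;

	for ( i = 0 ; i < raw_len ; i++ ) {
		/* Write each digit only if room remains for it and the NUL */
		if ( ( used + 1 ) < len )
			buf[used] = digits[ bytes[i] >> 4 ];
		used++;
		if ( ( used + 1 ) < len )
			buf[used] = digits[ bytes[i] & 0x0f ];
		used++;
	}
	if ( len )
		buf[ ( used < len ) ? used : ( len - 1 ) ] = '\0';
	return used;
}
```

A caller can pass a zero-length buffer to discover the required size, or a fixed-size buffer and accept truncation, which is what a settings formatter with a caller-supplied buffer needs.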
* [legal] Relicense files under GPL2_OR_LATER_OR_UBDL | Michael Brown | 2015-03-02 | 7 | -7/+35
  Relicense files for which I am the sole author (as identified by util/relicense.pl).

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Include destination address vector in ib_complete_recv() | Michael Brown | 2012-08-31 | 3 | -10/+18
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Use explicit "source" and "dest" address vector parameter names | Michael Brown | 2012-08-31 | 3 | -24/+27
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Allow queue pairs to have a custom allocator for receive iobufs | Michael Brown | 2012-08-31 | 2 | -2/+14
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [legal] Update FSF mailing address in GPL licence texts | Michael Brown | 2012-07-20 | 7 | -7/+14
  Suggested-by: Daniel P. Berrange <berrange@redhat.com>
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Use a one-shot process for CMRC shutdown | Michael Brown | 2011-06-28 | 1 | -4/+2
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [process] Pass containing object pointer to process step() methods | Michael Brown | 2011-06-28 | 1 | -5/+7
  Give the step() method a pointer to the containing object, rather than a pointer to the process. This is consistent with the operation of interface methods, and allows a single function to serve as both an interface method and a process step() method.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Send xfer_window_changed() when CMRC connection is established | Michael Brown | 2011-06-28 | 1 | -0/+3
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Add support for identifying the underlying hardware device | Michael Brown | 2010-09-22 | 1 | -0/+13
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Add node GUID as distinct from the first port GUID | Michael Brown | 2010-09-16 | 3 | -33/+76
  iPXE currently uses the first port's port GUID as the node GUID, rather than using the (possibly distinct) real node GUID. This can confuse opensm during the handover to a loaded OS: it thinks the port already belongs to a different node and so discards our port information with a warning message about duplicate ports.

  Everything is picked up correctly on the second subnet sweep, after opensm has established that the "old" node no longer exists, but this can delay link-up unnecessarily by several seconds.

  Fix by using the real node GUID.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Always call ib_link_state_changed() in ib_smc_update() | Michael Brown | 2010-09-16 | 1 | -2/+39
  ib_smc_update() potentially updates the Infiniband port state, and so should almost always be followed by a call to ib_link_state_changed(). The one exception is the call made to ib_smc_update() before the device is registered.

  Fix by removing explicit calls to ib_link_state_changed() from drivers using ib_smc_update(), including a call to ib_link_state_changed() within ib_smc_update(), and creating a separate ib_smc_init() for use prior to device registration.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Match GID/GUID terminology as used in the IBA | Michael Brown | 2010-09-15 | 8 | -89/+63
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [block] Replace gPXE block-device API with an iPXE asynchronous interface | Michael Brown | 2010-09-14 | 1 | -85/+260
  The block device interface used in gPXE predates the invention of even the old gPXE data-transfer interface, let alone the current iPXE generic asynchronous interface mechanism. Bring this old code up to date, with the following benefits:

   o  Block device commands can be cancelled by the requestor. The INT 13 layer uses this to provide a global timeout on all INT 13 calls, with the result that an unexpected passive failure mode (such as an iSCSI target ACKing the request but never sending a response) will lead to a timeout that gets reported back to the INT 13 user, rather than simply freezing the system.

   o  INT 13,00 (reset drive) is now able to reset the underlying block device. INT 13 users, such as DOS, that use INT 13,00 as a method for error recovery now have a chance of recovering.

   o  All block device commands are tagged, with a numerical tag that will show up in debugging output and in packet captures; this will allow easier interpretation of bug reports that include both sources of information.

   o  The extremely ugly hacks used to generate the boot firmware tables have been eradicated and replaced with a generic acpi_describe() method (exploiting the ability of iPXE interfaces to pass through methods to an underlying interface). The ACPI tables are now built in a shared data block within .bss16, rather than each requiring dedicated space in .data16.

   o  The architecture-independent concept of a SAN device has been exposed to the iPXE core through the sanboot API, which provides calls to hook, unhook, boot, and describe SAN devices. This allows for much more flexible usage patterns (such as hooking an empty SAN device and then running an OS installer via TFTP).

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Respond to CM disconnection requests | Michael Brown | 2010-09-12 | 1 | -68/+153
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Fix TID magic signature | Michael Brown | 2010-09-12 | 1 | -1/+1
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [retry] Hold reference while timer is running and during expiry callback | Michael Brown | 2010-09-03 | 1 | -1/+1
  Guarantee that a retry timer cannot go out of scope while the timer is running, and provide a guarantee to the expiry callback that the timer will remain in scope during the entire callback (similar to the guarantee provided to interface methods).

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
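The two guarantees in this commit can be illustrated with a simplified reference-counting model. All names here (`struct demo_refcnt`, `struct demo_timer`, `struct demo_object` and the `demo_*` functions) are hypothetical, and "freeing" is modelled by setting a flag rather than releasing memory, so that the object's state can still be inspected afterwards.

```c
#include <stddef.h>

/* Hypothetical reference counter; "freed" stands in for a real free() */
struct demo_refcnt { int count; int freed; };

void demo_ref_get ( struct demo_refcnt *refcnt ) {
	if ( refcnt )
		refcnt->count++;
}

void demo_ref_put ( struct demo_refcnt *refcnt ) {
	if ( refcnt && ( --refcnt->count == 0 ) )
		refcnt->freed = 1;
}

/* Hypothetical retry timer embedded within a reference-counted object */
struct demo_timer {
	int running;
	struct demo_refcnt *refcnt;
	void ( * expired ) ( struct demo_timer *timer );
};

struct demo_object {
	struct demo_refcnt refcnt;
	struct demo_timer timer;
	int alive_in_callback;
};

/* A running timer holds a reference to its containing object */
void demo_start_timer ( struct demo_timer *timer ) {
	if ( ! timer->running )
		demo_ref_get ( timer->refcnt );
	timer->running = 1;
}

/* On expiry, keep the running reference until the callback has returned */
void demo_timer_expire ( struct demo_timer *timer ) {
	struct demo_refcnt *refcnt = timer->refcnt;

	timer->running = 0;      /* reference is now held by this function */
	timer->expired ( timer );
	demo_ref_put ( refcnt ); /* object may be released only after callback */
}

/* Example callback: record whether the containing object is still valid */
void demo_expired ( struct demo_timer *timer ) {
	struct demo_object *obj = ( ( struct demo_object * )
		( ( char * ) timer - offsetof ( struct demo_object, timer ) ) );

	obj->alive_in_callback = ( ! obj->refcnt.freed );
}
```

In this model, an owner that drops its own reference while the timer is running does not release the object; the release happens only after the expiry callback has returned.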
* [interface] Convert all data-xfer interfaces to generic interfaces | Michael Brown | 2010-06-22 | 1 | -37/+21
  Remove data-xfer as an interface type, and replace data-xfer interfaces with generic interfaces supporting the data-xfer methods. Filter interfaces (as used by the TLS layer) are handled using the generic pass-through interface capability.

  A side-effect of this is that deliver_raw() no longer exists as a data-xfer method. (In practice this doesn't lose any efficiency, since there are no instances within the current codebase where xfer_deliver_raw() is used to pass data to an interface supporting the deliver_raw() method.)

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [retry] Add timer_init() wrapper function | Michael Brown | 2010-06-22 | 1 | -1/+1
  Standardise on using timer_init() to initialise an embedded retry timer, to match the coding style used by other embedded objects.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [refcnt] Add ref_init() wrapper function | Michael Brown | 2010-06-22 | 1 | -0/+1
  Standardise on using ref_init() to initialise an embedded reference count, to match the coding style used by other embedded objects.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [libc] Enable automated extraction of error usage reports | Michael Brown | 2010-05-31 | 1 | -4/+9
  Add preprocessor magic to the error definitions to enable every error usage to be tracked.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Use generic base16 functions for SRP | Michael Brown | 2010-05-28 | 1 | -9/+7
  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [build] Rename gPXE to iPXE | Michael Brown | 2010-04-20 | 9 | -35/+35
  Access to the gpxe.org and etherboot.org domains and associated resources has been revoked by the registrant of the domain. Work around this problem by renaming project from gPXE to iPXE, and updating URLs to match.

  Also update README, LOG and COPYRIGHTS to remove obsolete information.

  Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [infiniband] Rename IB_PKEY_NONE to IB_PKEY_DEFAULT | Michael Brown | 2009-11-16 | 1 | -1/+1
  There is no such thing as a non-existent partition.
* [infiniband] Include hostname in node description, if available | Michael Brown | 2009-11-16 | 1 | -1/+7
* [infiniband] Make node description invariant across all ports | Michael Brown | 2009-11-16 | 1 | -4/+5
  IBA section 14.2.5.2 states that "the contents of the NodeDescription attribute are the same for all ports on a node". Satisfy this by using the HCA GUID rather than the port GUID to form the node description string.
* [infiniband] Send CM requests to target node's GSI rather than SM's GSI | Michael Brown | 2009-10-17 | 1 | -1/+3
* [infiniband] Disambiguate CM connection rejection reasons | Michael Brown | 2009-08-10 | 2 | -4/+27
  There is diagnostic value in being able to disambiguate between the various reasons why an IB CM has rejected a connection attempt. In particular, reason 8 "invalid service ID" can be used to identify an incorrect SRP service_id root-path component, and reason 28 "consumer reject" corresponds to a genuine SRP login rejection IU, which can be passed up to the SRP layer.

  For rejection reasons other than "consumer reject", we should not pass through the private data, since it is most likely generated by the CM without any protocol-specific knowledge.
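The handling described in this commit can be sketched as a small dispatch on the rejection reason. The reason numbers 8 and 28 come from the commit message; the constant names, the result structure, the function name and the particular errno values are assumptions chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Reason codes quoted in the commit message; the macro names are hypothetical */
#define DEMO_CM_REJECT_BAD_SERVICE_ID 8
#define DEMO_CM_REJECT_CONSUMER 28

/* Hypothetical outcome handed to the upper layer (e.g. SRP) */
struct demo_reject_result {
	int rc;                   /* status code (errno choices are illustrative) */
	const void *private_data; /* passed through only for "consumer reject" */
	size_t private_data_len;
};

struct demo_reject_result demo_handle_reject ( unsigned int reason,
					       const void *private_data,
					       size_t private_data_len ) {
	struct demo_reject_result result = { -EPERM, NULL, 0 };

	switch ( reason ) {
	case DEMO_CM_REJECT_BAD_SERVICE_ID:
		/* Most likely an incorrect service ID in the root path */
		result.rc = -ENODEV;
		break;
	case DEMO_CM_REJECT_CONSUMER:
		/* Genuine rejection from the remote consumer: keep its data */
		result.rc = -ECONNREFUSED;
		result.private_data = private_data;
		result.private_data_len = private_data_len;
		break;
	default:
		/* Private data was probably generated by the CM itself */
		break;
	}
	return result;
}
```

Only the "consumer reject" case forwards the private data, so an upper layer never tries to parse bytes that the remote CM generated without protocol-specific knowledge.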
* [infiniband] Generate more specific errors in response to failure MADs | Michael Brown | 2009-08-10 | 4 | -6/+8
  Generate errors within individual MAD transaction consumers such as ib_pathrec.c and ib_mcast.c, rather than within ib_mi.c. This allows for more meaningful error messages to eventually be displayed to the user.
* [infiniband] Add support for SRP over Infiniband | Michael Brown | 2009-08-10 | 1 | -0/+406
  SRP is the SCSI RDMA Protocol. It allows for a method of SAN booting whereby the target is responsible for reading and writing data using Remote DMA directly to the initiator's memory. The software initiator merely sends and receives SCSI commands; it never has to touch the actual data.