NAME
fi_rxm - The RxM (RDM over MSG) Utility Provider

OVERVIEW
The RxM provider (ofi_rxm) is a utility provider that supports FI_EP_RDM type endpoints emulated over FI_EP_MSG type endpoint(s) of an underlying core provider. FI_EP_RDM endpoints have a reliable datagram interface, and RxM emulates this by hiding the connection management of the underlying FI_EP_MSG endpoints from the user. Additionally, RxM can hide the memory registration requirement of a core provider like verbs if the application does not support it.

REQUIREMENTS
Requirements for core provider
The RxM provider requires the core provider to support the following features:

Requirements for applications
Since RxM emulates RDM endpoints by hiding connection management, and connections are established only on demand (when the application tries to send data), the first several data transfer calls would return EAGAIN. Applications should be aware of this and retry until the operation succeeds.

If an application has chosen manual progress for data progress, it should also read the CQ so that connection establishment progresses. Not doing so would result in a stall. See also the ERRORS section in fi_msg(3).

SUPPORTED FEATURES
The RxM provider currently supports the FI_MSG, FI_TAGGED, FI_RMA and FI_ATOMIC capabilities.
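
The retry requirement described under "Requirements for applications" can be sketched as follows. This is a minimal illustration, not RxM code: send_with_retry, send_fn, progress_fn, struct fake_conn, fake_send and fake_progress are all hypothetical names introduced here. In a real application the send callback would wrap a call such as fi_send() and the progress callback would wrap fi_cq_read() on the endpoint's CQ. The sketch relies on libfabric defining FI_EAGAIN as EAGAIN, so it compiles without the libfabric headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback types standing in for libfabric calls.
 * A real send_fn would wrap fi_send()/fi_tsend(); a real progress_fn
 * would wrap fi_cq_read() so that connection setup can advance. */
typedef ssize_t (*send_fn)(void *ctx);
typedef void (*progress_fn)(void *ctx);

/* Retry a transfer while it reports -EAGAIN, driving progress between
 * attempts. Gives up after max_attempts and returns the last result. */
ssize_t send_with_retry(send_fn try_send, progress_fn drive_progress,
                        void *ctx, long max_attempts)
{
    ssize_t ret = -EAGAIN;

    assert(try_send != NULL && drive_progress != NULL);

    for (long i = 0; i < max_attempts; i++) {
        ret = try_send(ctx);
        if (ret != -EAGAIN)
            break;              /* success or a hard error */
        drive_progress(ctx);    /* required with manual progress */
    }
    return ret;
}

/* Stand-in for a connection that becomes usable only after a number
 * of progress calls (purely illustrative). */
struct fake_conn {
    int progress_needed;
    int sends_attempted;
};

ssize_t fake_send(void *ctx)
{
    struct fake_conn *c = ctx;

    c->sends_attempted++;
    return c->progress_needed > 0 ? -EAGAIN : 0;
}

void fake_progress(void *ctx)
{
    struct fake_conn *c = ctx;

    if (c->progress_needed > 0)
        c->progress_needed--;
}
```

With a stand-in connection that needs three progress calls, the fourth send attempt succeeds. The attempt limit is a design choice of this sketch; an application may prefer to retry indefinitely or apply a timeout.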

LIMITATIONS
When using the RxM provider, some limitations of the underlying MSG provider may also show up. Please refer to the corresponding MSG provider man pages to find out about those limitations.

Unsupported features
The RxM provider does not support the following features:

Progress limitations
When sending large messages, an application doing an sread or waiting on the CQ file descriptor may not get a completion when reading the CQ after being woken up from the wait. The application has to do an sread or wait on the file descriptor again. This is needed because RxM uses a rendezvous protocol for large message sends. An application is woken up from waiting on the CQ fd when the rendezvous protocol request completes, but it has to wait again to get an ACK from the receiver indicating completion of the large message transfer by remote RMA read.

FI_ATOMIC limitations
The FI_ATOMIC capability will only be listed in the fi_info if the fi_info hints parameter specifies FI_ATOMIC. If FI_ATOMIC is requested, the message orders FI_ORDER_RAR, FI_ORDER_RAW, FI_ORDER_WAR, FI_ORDER_WAW, FI_ORDER_SAR, and FI_ORDER_SAW cannot be supported.

Miscellaneous limitations

RUNTIME PARAMETERS
The ofi_rxm provider checks for the following environment variables.
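
Environment variables are normally set in the shell or job launcher, but a process can also set them itself. The sketch below is a minimal illustration that assumes the provider reads its variables when libfabric is first initialized, so the helper would need to run before the first libfabric call such as fi_getinfo(). The function name configure_rxm_env is hypothetical, and the values are placeholders rather than recommendations.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict C modes */
#include <stdlib.h>

/* Set two RxM-related variables for the current process.
 * overwrite != 0 replaces values already present in the environment;
 * overwrite == 0 keeps values the user has exported.
 * Returns 0 on success, -1 if any setenv() call fails. */
int configure_rxm_env(int overwrite)
{
    /* Placeholder: raise the expected peer count above the default. */
    if (setenv("FI_UNIVERSE_SIZE", "1024", overwrite) != 0)
        return -1;

    /* Use shared receive contexts for the MSG provider. */
    if (setenv("FI_OFI_RXM_USE_SRX", "1", overwrite) != 0)
        return -1;

    return 0;
}
```

Passing overwrite as 0 lets values exported by the user take precedence over the built-in placeholders.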

Tuning
Bandwidth
To optimize for bandwidth, use higher-than-default values for FI_OFI_RXM_TX_SIZE, FI_OFI_RXM_RX_SIZE, FI_OFI_RXM_MSG_TX_SIZE and FI_OFI_RXM_MSG_RX_SIZE, subject to the memory limits of the system and the TX and RX sizes supported by the MSG provider.

FI_OFI_RXM_SAR_LIMIT is another knob that can be experimented with to optimize for bandwidth.

Memory
To conserve memory, ensure that FI_UNIVERSE_SIZE is set to what is required. Similarly, check that the FI_OFI_RXM_TX_SIZE, FI_OFI_RXM_RX_SIZE, FI_OFI_RXM_MSG_TX_SIZE and FI_OFI_RXM_MSG_RX_SIZE environment variables are set only to the required values.

NOTES
The data transfer API may return -FI_EAGAIN during on-demand connection setup of the core provider FI_MSG_EP. See fi_msg(3) for a detailed description of handling FI_EAGAIN.

Troubleshooting / Known issues
If an RxM endpoint is expected to communicate with more peers than the default value of FI_UNIVERSE_SIZE (256), CQ overruns can happen. To avoid this, set a higher value for FI_UNIVERSE_SIZE. A CQ overrun can make a MSG endpoint unusable.

At higher numbers of ranks, there may be connection errors due to a node running out of memory. The workaround is to use shared receive contexts for the MSG provider (FI_OFI_RXM_USE_SRX=1) or to reduce the eager message size (FI_OFI_RXM_BUFFER_SIZE) and the MSG provider TX/RX queue sizes (FI_OFI_RXM_MSG_TX_SIZE / FI_OFI_RXM_MSG_RX_SIZE).

SEE ALSO
fabric(7), fi_provider(7), fi_getinfo(3)

AUTHORS
OpenFabrics.