Configuration Guide
Transport Acceleration Options

Transport acceleration options enable kernel-bypass acceleration in conjunction with the following vendor solutions:


Myricom Datagram Bypass Layer (DBL)  <-

DBL is a kernel-bypass technology that accelerates sending and receiving UDP traffic and operates with DBL-enabled Myricom 10-Gigabit Ethernet adapter cards for Linux and Microsoft Windows.

DBL does not support fragmentation and reassembly, so do not send messages larger than the MTU size configured on the DBL interface.

DBL acceleration is compatible with the following Ultra Messaging transport types:

  • LBT-RM (UDP-based reliable multicast)
  • LBT-RU (UDP-based reliable unicast)
  • Multicast Immediate Messaging
  • Multicast Topic Resolution

To use DBL Transport Acceleration, perform the following steps:

  1. Install the Myricom 10-Gigabit Ethernet NIC.
  2. Install the DBL shared library.
  3. Update your search path to include the location of the DBL shared library.
  4. Set the options transport_*_datagram_max_size and resolver_datagram_max_size (context) to a value no larger than the Myricom interface's configured MTU size minus 28 bytes (20 bytes for the IP header plus 8 for the UDP header), as shown in the sketch below.
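
For example, a minimal sketch of a flat UM configuration file covering step 4 together with the DBL acceleration options documented in the Reference below might look like the following. This assumes the usual "scope option_name value" file format, a Myricom interface MTU of 1500, and that transport_lbtrm_datagram_max_size and transport_lbtru_datagram_max_size are the relevant instances of transport_*_datagram_max_size for your transports:

  # Enable DBL acceleration for each desired transport type.
  context dbl_lbtrm_acceleration 1
  context dbl_lbtru_acceleration 1
  context dbl_mim_acceleration 1
  context dbl_resolver_acceleration 1
  # Avoid IP fragmentation: 1500 MTU - 20 (IP header) - 8 (UDP header) = 1472.
  context transport_lbtrm_datagram_max_size 1472
  context transport_lbtru_datagram_max_size 1472
  context resolver_datagram_max_size 1472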

Users of DBL are advised to make use of Dynamic Fragmentation Reduction.


Reference  <-


dbl_lbtrm_acceleration (context)  <-

Flag indicating if DBL acceleration is enabled for LBT-RM transports.
See Myricom Datagram Bypass Layer (DBL).
Scope: context
Type: int
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 4.0.

Value  Description
1      DBL acceleration is enabled for LBT-RM.
0      DBL acceleration is not enabled for LBT-RM. Default for all.


dbl_lbtru_acceleration (context)  <-

Flag indicating if DBL acceleration is enabled for LBT-RU transports.
See Myricom Datagram Bypass Layer (DBL).
Scope: context
Type: int
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 4.0.

Value  Description
1      DBL acceleration is enabled for LBT-RU.
0      DBL acceleration is not enabled for LBT-RU. Default for all.


dbl_mim_acceleration (context)  <-

Flag indicating if DBL acceleration is enabled for MIM.
See Myricom Datagram Bypass Layer (DBL).
See Multicast Immediate Messaging for general information about MIM.
Scope: context
Type: int
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 4.0.

Value  Description
1      DBL acceleration is enabled for MIM.
0      DBL acceleration is not enabled for MIM. Default for all.


dbl_resolver_acceleration (context)  <-

Flag indicating if DBL acceleration is enabled for topic resolution.
See Myricom Datagram Bypass Layer (DBL).
Scope: context
Type: int
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 4.0.

Value  Description
1      DBL acceleration is enabled for topic resolution.
0      DBL acceleration is not enabled for topic resolution. Default for all.


Solarflare Onload  <-

In UM documentation, we use the term "Solarflare" for NIC technology originally developed by Solarflare Communications Inc. As of this writing, that technology is owned by Advanced Micro Devices, Inc. (AMD) and sold under its Xilinx brand.

Onload is a kernel-bypass technology available for Linux that accelerates message traffic and operates with Solarflare Ethernet NICs. There is an open-source version of Onload called OpenOnload. Ultra Messaging does not differentiate between the two versions.

Ultra Messaging loads the Onload library dynamically if Onload functionality is specified in the UM configuration. Specifically, use of any of the onload_acceleration_stack_name configuration options (context, source, or receiver), documented in the Reference below, causes UM to load the Onload library.


Onload Stack Names  <-

Onload and Solarflare NICs can support multiple "stacks" which can be used by software to send and receive packets. Different stacks can be used concurrently without interference, which is valuable to latency-sensitive multi-threaded applications. By ensuring that the sockets of a stack are only accessed by a single thread, you can keep latency outliers to a minimum.

Onload defaults to accelerating all sockets within a process on a single stack. This is not always desired; users often want to accelerate only certain sockets, or to assign different sockets to different stacks, depending on their threading needs. The UM configuration options onload_acceleration_stack_name (context), onload_acceleration_stack_name (source), and onload_acceleration_stack_name (receiver) control the stack used by various sockets. These options apply to the TCP, LBT-RU, and LBT-RM transport types.

The onload_acceleration_stack_name (source) option controls the Onload stack for the sockets associated with the underlying transport session of a UM source object. Note that the option only applies when the first source object on a given transport session is created. Subsequent sources created on the same transport session do not affect the Onload stack.

Similarly, the onload_acceleration_stack_name (receiver) option controls the Onload stack for the sockets associated with the underlying transport sessions of a UM receiver object. Note that unlike a source, a receiver object can be associated with more than one transport session if the topic is published by more than one application instance. If sources come and go, the receiver may join and leave transport sessions. The stack name option only applies when a receiver object discovers a transport session and causes UM to join it; subsequent receiver objects mapped to the same transport session do not affect the Onload stack. However, when using multiple XSPs, take care to ensure that all transport sessions associated with a given receiver object are handled by the same XSP. Otherwise, multiple XSPs can handle the same Onload stack, which can introduce latency outliers.

Finally, the onload_acceleration_stack_name (context) option controls the Onload stack for the sockets associated with the entire context. This includes all sockets associated with source and receiver objects, as well as sockets associated with topic resolution, Unicast Immediate Messaging, and a Unix pipe used by UM for internal thread synchronization. Note that if the context stack name option is supplied, any source or receiver scoped stack name options are ignored.
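
As a hedged illustration, the following configuration fragment (assuming the flat "scope option_name value" file format; the stack names are hypothetical and must be eight characters or less) places a context's sources on one Onload stack and its receivers on another:

  # Hypothetical stack names chosen for illustration.
  source onload_acceleration_stack_name txstack
  receiver onload_acceleration_stack_name rxstack

Setting the context-scoped option instead would direct all of the context's sockets to a single stack and, as noted above, cause any source- or receiver-scoped stack name settings to be ignored.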

Note
You can set the LBM_SUPPRESS_ONLOAD environment variable to any value to prevent UM from loading Onload. In this case, you cannot use the onload_acceleration_stack_name options.

If your application uses the onload_set_stackname API directly for any non-UM sockets, note that after UM accelerates a transport socket, Ultra Messaging resets the stackname to the default for all threads by calling:

onload_set_stackname(ONLOAD_ALL_THREADS, ONLOAD_SCOPE_NOCHANGE, "");

Ultra Messaging resets the stackname during source creation and when a receiver matched topic opens a transport session.


Using Onload with UM  <-

To enable Onload socket acceleration for only selected transports, perform the following steps:

  1. Install Onload.
  2. Set the Onload environment variable EF_DONT_ACCELERATE=1 to disable Onload's default behavior of accelerating all sockets.
  3. Start the application as in the following example:

    onload <app_name> [app_options]

  4. Set UM stack name configuration options for the application's sources and receivers.
  5. Disable batching to ensure that it is the application thread that sends the data out.
  6. If using multiple XSPs, ensure that all transport sessions associated with each receiver object are handled by the same XSP. Otherwise you can have multiple XSPs handling the same Onload stack, which can introduce latency outliers.
  7. Prevent IP fragmentation by setting the options transport_*_datagram_max_size and resolver_datagram_max_size (context) to a value no larger than the Solarflare interface's configured MTU size minus 28 bytes (usually 1472). See Message Fragmentation and Reassembly and the configuration sketch following this list.
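
The following fragment is a sketch of the configuration portion of steps 5 and 7. It assumes the flat "scope option_name value" file format, a 1500-byte MTU, and that implicit batching is the batching feature being disabled, via the standard implicit_batching_minimum_length source option; verify the option against your UM version:

  # Step 5: effectively disable implicit batching so the application thread sends immediately.
  source implicit_batching_minimum_length 1
  # Step 7: keep datagrams within the Solarflare interface's MTU (1500 - 28 = 1472).
  context transport_lbtrm_datagram_max_size 1472
  context transport_lbtru_datagram_max_size 1472
  context resolver_datagram_max_size 1472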

Users of Onload are advised to set Dynamic Fragmentation Reduction.

For detailed information about Onload, see the Onload User Guide.


Solarflare Tips  <-

Onload does not support IP fragmentation and reassembly, so do not configure UM send datagrams that would require IP fragmentation. See Datagram Max Size and Network MTU.

Warning
Onload does not support both accelerated and non-accelerated processes subscribing to the same multicast group on the same host. An attempt to do so will result in the non-accelerated process becoming "deaf" to the shared multicast group. See the Onload User Guide section Multicast Receive to Onload or Kernel Stack.

For many of our customers, bursts of many tens or even hundreds of thousands of messages per second are not unusual during a trading day. Message rates this high can stress the networking stack, from the switch to the NIC and driver to UM. Packet loss can happen, leading to high latency if those packets are successfully recovered, or potentially to Unrecoverable Loss.

Informatica is not an expert in tuning Solarflare NICs and Onload. We recommend using the Onload documentation and discussing your use case with Onload support engineers.

However, we can give a few tips based on our own experience and that of our customers.

  1. The number of receive descriptors (the size of the RX ring buffer) should always be set to the maximum value (typically 4096, but check your NIC's limit to be sure).
    • For kernel driver, check the current and maximum settings with:
      ethtool -g sfdevicename
      You can change it using ethtool:
      ethtool -G sfdevicename rx 4096
      but this will only stay in effect until the next reboot. Different versions of Linux have different methods for making the changes permanent.
    • For Onload, set the environment variable:
      export EF_RXQ_SIZE=4096
  2. Set the maximum size of datagram that UM will generate.
  3. If using Onload, you can get better performance if you configure sources on one stack using onload_acceleration_stack_name (source) and receivers on a different stack using onload_acceleration_stack_name (receiver).
  4. For a wealth of additional information, see Onload documentation, especially Tuning Onload and Eliminating Drops.


Reference  <-


onload_acceleration_stack_name (context)  <-

The stackname to use when creating an Onload socket.
Sets the stackname when creating Onload sockets on the context. The stackname must be eight characters or less. To disable the stackname, set this option to NULL, which must be all uppercase.
Note: Use of this option requires Onload.
See Onload Stack Names for more information.
Scope: context
Type: string
Default value: NULL
When to Set: Can only be set during object initialization.
Version: This option was implemented in UM 6.16.


onload_acceleration_stack_name (receiver)  <-

The stackname to use when creating an Onload transport data socket.
The stackname must be eight characters or less. Because this is a transport setting, the first receiver to join a transport session applies its configuration option setting, and other receivers that join the same transport session inherit the first receiver's setting. To disable the stackname, set this option to NULL, which must be all uppercase.
Note: Use of this option requires Onload and applies to LBT-RM, LBT-RU, and TCP transports.
See Onload Stack Names for more information.
Scope: receiver
Type: string
Default value: NULL
When to Set: Can only be set during object initialization.
Version: This option was implemented in UM 6.5.


onload_acceleration_stack_name (source)  <-

The stackname to use when creating an Onload transport data socket.
The stackname must be eight characters or less. Because this is a transport setting, the first source applies its configuration option setting, and other sources that join the transport inherit the setting of the first source. To disable the stackname, set this option to NULL, which must be all uppercase.
Note: Use of this option requires Onload and applies to LBT-RM, LBT-RU, and TCP transports.
See Onload Stack Names for more information.
Scope: source
Type: string
Default value: NULL
When to Set: Can only be set during object initialization.
Version: This option was implemented in UM 6.5.
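
As a brief illustration of the NULL sentinel, the following fragment (hypothetical stack name; flat "scope option_name value" file format assumed) gives receivers a dedicated stack while overriding any source stack name set earlier in the configuration:

  receiver onload_acceleration_stack_name rxstack
  # NULL (all uppercase) disables the stackname, reverting sources to default Onload behavior.
  source onload_acceleration_stack_name NULL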


UD Acceleration for Mellanox Hardware Interfaces  <-

UD (Unreliable Datagram) acceleration is a kernel-bypass technology that accelerates sending and receiving UDP traffic and operates with Mellanox 10-Gigabit Ethernet or InfiniBand adapter cards for 64-bit Linux on X86 platforms.

UD acceleration does not support fragmentation and reassembly, so do not send messages larger than the MTU size configured on the Mellanox interface.

UD acceleration is available for the following Ultra Messaging transport types:

  • LBT-RM (UDP-based reliable multicast)
  • LBT-RU (UDP-based reliable unicast)
  • Multicast Immediate Messaging
  • Multicast Topic Resolution

To use UD acceleration, perform the following steps:

  1. Install the Mellanox NIC.
  2. Install the VMA package, which is part of the UD acceleration option.
  3. Include the appropriate transport acceleration options in your Ultra Messaging Configuration File.
  4. Set the options transport_*_datagram_max_size and resolver_datagram_max_size (context) to a value no larger than the Mellanox interface's configured MTU size minus 28 bytes, as shown in the sketch below.
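
A minimal configuration sketch for steps 3 and 4 might look like the following (flat "scope option_name value" file format assumed, with a 1500-byte MTU on the Mellanox interface):

  # Enable UD acceleration; requires the Accelerated Multicast Module (see the Reference below).
  context ud_acceleration 1
  # Disable this if using 10 Gigabit Ethernet ConnectX hardware with multiple UM contexts on the host.
  context resolver_ud_acceleration 1
  # Avoid IP fragmentation: 1500 MTU - 20 (IP header) - 8 (UDP header) = 1472.
  context transport_lbtrm_datagram_max_size 1472
  context resolver_datagram_max_size 1472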

Users of UD acceleration are advised to make use of Dynamic Fragmentation Reduction.


Reference  <-


resolver_ud_acceleration (context)  <-

Flag indicating if Accelerated Multicast is enabled for Topic Resolution. Accelerated Multicast requires Mellanox InfiniBand or 10 Gigabit Ethernet hardware.
UD Acceleration of topic resolution relies on hardware-supported loopback, which InfiniBand provides, but which the 10 Gigabit Ethernet ConnectX hardware does not.
Note: If 10 Gigabit Ethernet ConnectX hardware is used and multiple UM contexts are desired on the host, this option must be disabled.
Scope: context
Type: int
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 5.2.

Value  Description
1      Accelerated Topic Resolution is enabled.
0      Accelerated Topic Resolution is not enabled. Default for all.


ud_acceleration (context)  <-

Flag indicating if Accelerated Multicast is enabled for LBT-RM.
Accelerated Multicast requires InfiniBand or 10 Gigabit Ethernet hardware and the purchase and installation of the Ultra Messaging Accelerated Multicast Module. See your Ultra Messaging representative for licensing specifics.
Scope: context
Type: int
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 4.1.

Value  Description
1      Accelerated Multicast is enabled.
0      Accelerated Multicast is not enabled. Default for all.