Concepts Guide
The TCP UM transport uses normal TCP connections to send messages from sources to receivers. This is the default transport when none is explicitly configured. TCP is a good choice when:
UM's TCP transport includes a Session ID. A UM source using the TCP transport generates a unique, 32-bit non-zero random Session ID for each TCP transport (IP:port) it uses. The source also includes the Session ID in its Topic Resolution advertisement (TIR). When a receiver resolves its topic and discovers the transport information, the receiver also obtains the transport's Session ID. The receiver sends a message to the source to confirm the Session ID.
The TCP Session ID enables multiple receivers for a topic to connect to a source across a UM Router. In the event of a UM Router failure, UM establishes new topic routes which can cause cached Topic Resolution and transport information to be outdated. Receivers use this cached information to find sources. Session IDs add a unique identifier to the cached transport information. If a receiver tries to connect to a source with outdated transport information, the source recognizes an incorrect Session ID and disconnects the receiver. The receiver can then attempt to reconnect with different cached transport information.
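Because TCP is the default, no configuration is needed to use it, but a source can also select it explicitly. The following is a minimal sketch in C (roughly equivalent to the flat-file line 'source transport tcp'); it builds a source topic attribute object to pass to lbm_src_topic_alloc(), with error handling reduced to a single check.

#include <stdio.h>
#include <lbm/lbm.h>

/* Build a source topic attribute that explicitly selects the TCP
 * transport; pass the result to lbm_src_topic_alloc(). */
lbm_src_topic_attr_t *make_tcp_src_attr(void)
{
    lbm_src_topic_attr_t *tattr;

    if (lbm_src_topic_attr_create(&tattr) == LBM_FAILURE) {
        fprintf(stderr, "attr_create: %s\n", lbm_errmsg());
        return NULL;
    }
    if (lbm_src_topic_attr_str_setopt(tattr, "transport", "tcp") == LBM_FAILURE) {
        fprintf(stderr, "setopt: %s\n", lbm_errmsg());
        lbm_src_topic_attr_delete(tattr);
        return NULL;
    }
    return tattr;
}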
The LBT-RU UM transport adds reliable delivery to unicast UDP to send messages from sources to receivers. This provides greater flexibility in the control of latency. For example, the application can further limit latency by allowing the use of arrival order delivery. See the Knowledge Base article, FAQ: How do arrival-order delivery and in-order delivery affect latency?. Also, LBT-RU is less sensitive to overall network load; it uses source rate controls to limit its maximum send rate.
Since it is based on unicast addressing, LBT-RU can pass through most firewalls. However, it has the same scaling issues as TCP when multiple receivers are present for each topic.
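As a hedged illustration of the rate controls mentioned above, the sketch below sets an LBT-RU data rate limit on a context attribute object before the context is created. It assumes the transport_lbtru_data_rate_limit (context) option; check the UM Configuration Guide for the exact option name, and note that the 10 Mbps value is arbitrary.

#include <stdio.h>
#include <lbm/lbm.h>

/* Build a context attribute object that caps LBT-RU transmission at
 * roughly 10 Mbps.  Pass the result to lbm_context_create(). */
lbm_context_attr_t *make_lbtru_ctx_attr(void)
{
    lbm_context_attr_t *cattr;

    if (lbm_context_attr_create(&cattr) == LBM_FAILURE) {
        fprintf(stderr, "lbm_context_attr_create: %s\n", lbm_errmsg());
        return NULL;
    }
    /* Maximum LBT-RU payload bytes per second sent by this context
     * (assumed option name: transport_lbtru_data_rate_limit). */
    if (lbm_context_attr_str_setopt(cattr,
            "transport_lbtru_data_rate_limit", "10000000") == LBM_FAILURE) {
        fprintf(stderr, "setopt: %s\n", lbm_errmsg());
        lbm_context_attr_delete(cattr);
        return NULL;
    }
    return cattr;
}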
UM's LBT-RU transport includes a Session ID. A UM source using the LBT-RU transport generates a unique, 32-bit non-zero random Session ID for each transport it uses. The source also includes the Session ID in its Topic Resolution advertisement (TIR). When a receiver resolves its topic and discovers the transport information, the receiver also obtains the transport's Session ID.
The LBT-RU Session ID enables multiple receivers for a topic to connect to a source across a UM Router. In the event of a UM Router failure, UM establishes new topic routes which can cause cached Topic Resolution and transport information to be outdated. Receivers use this cached information to find sources. Session IDs add a unique identifier to the cached transport information. If a receiver tries to connect to a source with outdated transport information, the transport drops the received data and times out. The receiver can then attempt to reconnect with different cached transport information.
The LBT-RM transport adds reliable multicast to UDP to send messages. This provides the maximum flexibility in the control of latency. In addition, LBT-RM can scale effectively to large numbers of receivers per topic using network hardware to duplicate messages only when necessary at wire speed. One limitation is that multicast is often blocked by firewalls.
LBT-RM is a UDP-based, reliable multicast protocol designed with the use of UM and its target applications specifically in mind. The protocol is very similar to PGM, but with changes to aid low latency messaging applications.
UM's LBT-RM transport includes a Session ID. A UM source using the LBT-RM transport generates a unique, 32-bit non-zero random Session ID for each transport it uses. The source also includes the Session ID in its Topic Resolution advertisement (TIR). When a receiver resolves its topic and discovers the transport information, the receiver also obtains the transport's Session ID.
The LBT-IPC transport is an Interprocess Communication (IPC) UM transport that allows sources to publish topic messages to a shared memory area managed as a static ring buffer, from which receivers can read topic messages. Message exchange takes place at memory access speed, which can greatly improve throughput when sources and receivers reside on the same host. LBT-IPC can be either source-paced or receiver-paced.
The LBT-IPC transport uses a "lock free" design that eliminates calls to the Operating System and gives receivers quicker access to messages. An internal validation method, applied by receivers while reading messages from the Shared Memory Area, ensures message data integrity. The validation method compares IPC header information at different times to ensure consistent, and therefore valid, message data. Sources can send individual messages or a batch of messages, each of which possesses an IPC header.
The following diagram illustrates the Shared Memory Area used for LBT-IPC:
Header
The Header contains information about the shared memory area resource.
Receiver Pool
The receiver pool is a collection of receiver connections maintained in the Shared Memory Area. If you've configured receiver pacing, the source reads this information to determine whether a message can be reclaimed or to monitor a receiver. Each receiver is responsible for finding a free entry in the pool and marking it as used.
Source-to-Receiver Message Buffer
This area contains message data. You specify the size of the shared memory area with a source option, transport_lbtipc_transmission_window_size (source). The size of the shared memory area cannot exceed your platform's shared memory area maximum size. UM stores the memory size in the shared memory area's header. The Old Message Start and New Message Start point to positions in this buffer.
When you create a source with lbm_src_create() and you've set the transport option to IPC, UM creates a shared memory area object. UM assigns one of the transport IDs to this area from a range specified with the UM context configuration options, transport_lbtipc_id_high (context) and transport_lbtipc_id_low (context). You can also specify a shared memory location outside of this range with a source configuration option, transport_lbtipc_id (source), to prioritize certain topics, if needed.
UM names the shared memory area object according to the format, LBTIPC_x_d where x is the hexadecimal Session ID and d is the decimal Transport ID. Example names are LBTIPC_42792ac_20000 or LBTIPC_66e7c8f6_20001. Receivers access a shared memory area with this object name to receive (read) topic messages.
Using the configuration option, transport_lbtipc_behavior (source), you can choose source-paced or receiver-paced message transport. See Transport LBT-IPC Operation Options for more information.
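To make this setup concrete, the following is a minimal sketch in C of creating an LBT-IPC source. It restricts the context's transport ID range and selects the IPC transport on the source topic attribute; the topic name and the 20001-20005 range are illustrative placeholders, not recommendations.

#include <stdio.h>
#include <stdlib.h>
#include <lbm/lbm.h>

/* Abort on any UM API failure (sketch-level error handling). */
static void check(int rc, const char *what)
{
    if (rc == LBM_FAILURE) {
        fprintf(stderr, "%s: %s\n", what, lbm_errmsg());
        exit(1);
    }
}

int main(void)
{
    lbm_context_attr_t *cattr;
    lbm_context_t *ctx;
    lbm_src_topic_attr_t *tattr;
    lbm_topic_t *topic;
    lbm_src_t *src;

    /* Restrict the transport ID range UM may assign to IPC shared
     * memory areas created by this context (values are illustrative). */
    check(lbm_context_attr_create(&cattr), "context_attr_create");
    check(lbm_context_attr_str_setopt(cattr, "transport_lbtipc_id_low", "20001"), "id_low");
    check(lbm_context_attr_str_setopt(cattr, "transport_lbtipc_id_high", "20005"), "id_high");
    check(lbm_context_create(&ctx, cattr, NULL, NULL), "context_create");
    lbm_context_attr_delete(cattr);

    /* Select LBT-IPC for this source. */
    check(lbm_src_topic_attr_create(&tattr), "topic_attr_create");
    check(lbm_src_topic_attr_str_setopt(tattr, "transport", "lbtipc"), "transport");

    check(lbm_src_topic_alloc(&topic, ctx, "LocalTopic", tattr), "topic_alloc");
    check(lbm_src_create(&src, ctx, topic, NULL, NULL, NULL), "src_create");

    check(lbm_src_send(src, "hello", 5, LBM_MSG_FLUSH), "send");

    lbm_src_delete(src);
    lbm_src_topic_attr_delete(tattr);
    lbm_context_delete(ctx);
    return 0;
}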
Sending over LBT-IPC
To send on a topic (write to the shared memory area), the source writes to the Shared Memory Area starting at the Oldest Message Start position. It then increments each receiver's Signal Lock if the receiver has not set this to zero.
Receivers operate identically to receivers for all other UM transports. A receiver can actually receive topic messages from a source sending on its topic over TCP, LBT-RU or LBT-RM and from a second source sending on LBT-IPC without any special configuration. The receiver learns what it needs to join the LBT-IPC session through the topic advertisement.
The configuration option transport_lbtipc_receiver_thread_behavior (context) controls the IPC receiving thread behavior when there are no messages available. The default behavior, 'pend', has the receiving thread pend on a semaphore for a new message. When the source adds a message, it posts to each pending receiver's semaphore to wake the receiving thread. Alternatively, 'busy_wait' can be used to prevent the receiving thread from going to sleep. In this case, the source does not need to post to the receiver's semaphore. It simply adds the message to shared memory, which the looping receiving thread detects with the lowest possible latency.
Although 'busy_wait' has the lowest latency, it has the drawback of consuming 100% of a CPU core during periods of idleness. This limits the number of IPC data flows that can be used on a given machine to the number of available cores. (If more busy-looping receivers are deployed than there are cores, receivers can suffer 10-millisecond time-sharing quantum latencies.)
For applications that cannot afford 'busy_wait', there is another configuration option, transport_lbtipc_pend_behavior_linger_loop_count (context), which allows a middle ground between 'pend' and 'busy_wait'. The receiver is still configured as 'pend', but instead of going to sleep on the semaphore immediately upon emptying the shared memory, it busy-loops for the configured number of iterations. If a new message arrives during that time, the receiver processes the message immediately without a sleep/wakeup. This can be very useful for reducing latency during bursts of high incoming message rates. By making the loop count large enough to cover the incoming message interval during a burst, only the first message of the burst incurs the wakeup latency.
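A hedged sketch of configuring this middle-ground behavior on the receiving context follows: 'pend' plus a linger loop count. The count of 100000 is arbitrary and would be tuned to cover the expected inter-message gap during a burst.

#include <lbm/lbm.h>

/* Context attribute for an IPC receiver that pends on the semaphore,
 * but busy-loops briefly before sleeping to absorb message bursts.
 * Pass the result to lbm_context_create(); returns NULL on failure. */
lbm_context_attr_t *make_ipc_rcv_ctx_attr(void)
{
    lbm_context_attr_t *cattr;

    if (lbm_context_attr_create(&cattr) == LBM_FAILURE)
        return NULL;
    if (lbm_context_attr_str_setopt(cattr,
            "transport_lbtipc_receiver_thread_behavior", "pend") == LBM_FAILURE ||
        /* Number of empty polls before the thread actually sleeps
         * (illustrative value; tune to the expected burst rate). */
        lbm_context_attr_str_setopt(cattr,
            "transport_lbtipc_pend_behavior_linger_loop_count", "100000") == LBM_FAILURE) {
        lbm_context_attr_delete(cattr);
        return NULL;
    }
    return cattr;
}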
Topic Resolution and LBT-IPC
Topic resolution operates with LBT-IPC the same way as with other UM transports, albeit with a new advertisement type, LBMIPC. Advertisements for LBT-IPC contain the Transport ID, Session ID, and Host ID. Receivers obtain LBT-IPC advertisements in the normal manner (resolver cache, advertisements received on the multicast resolver address:port, and responses to queries). Advertisements for topics from LBT-IPC sources can reach receivers on different machines if they use the same topic resolution configuration; however, those receivers silently ignore those advertisements since they cannot join the IPC transport. See Sending to Both Local and Remote Receivers.
Receiver Pacing
Although receiver pacing is a source behavior option, additional coordination must happen on the receiving side to ensure that a source does not reclaim (overwrite) a message until all receivers have read it. When you use the default transport_lbtipc_behavior (source) (source-paced), each receiver's Oldest Message Start position in the Shared Memory Area is private to that receiver. The source writes to the Shared Memory Area independently of receivers' reading. With receiver pacing, however, all receivers share their Oldest Message Start position with the source. The source will not reclaim a message until all receivers have successfully read that message.
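Receiver pacing is selected per source. A minimal sketch of the relevant attribute setting in C, assuming the receiver_paced value documented for transport_lbtipc_behavior (source); the rest of the source setup is unchanged from the earlier sketch.

#include <lbm/lbm.h>

/* Switch an IPC source from the default source pacing to receiver
 * pacing, so messages are not reclaimed until every receiver reads them.
 * Returns 0 on success, LBM_FAILURE otherwise. */
int set_receiver_paced(lbm_src_topic_attr_t *tattr)
{
    return lbm_src_topic_attr_str_setopt(tattr,
        "transport_lbtipc_behavior", "receiver_paced");
}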
Receiver Monitoring
To ensure that a source does not wait on a receiver that is not running, the source monitors a receiver via the Monitor Shared Lock allocated to each receiving context. (This lock is in addition to the semaphore already allocated for signaling new data.) A new receiver takes and holds the Monitor Shared Lock and releases the resource when it dies. If the source is able to obtain the resource, it knows the receiver has died. The source then clears the receiver's In Use flag in its Receiver Pool Connection.
Although no actual network transport occurs, IPC functions in much the same way as if you were sending packets across the network, as with other UM transports.
A source application that wants to support both local and remote receivers should create two UM Contexts with different topic resolution configurations, one for IPC sends and one for sends to remote receivers. Separate contexts allow you to use the same topic for both IPC and network sources. If you simply created two source objects (one IPC and one, say, LBT-RM) in the same UM Context, you would have to use separate topics and possibly suffer higher latency because the sending thread would be blocked for the duration of two send calls.
A UM source never automatically uses IPC for local receivers and a network transport for remote receivers, because the discovery of a remote receiver would hurt the performance of local receivers. An application that wants transparent switching can implement it in a simple wrapper.
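The following sketch illustrates the two-context approach in C: one context carries the IPC source, a second context carries an LBT-RM source on the same topic, and the wrapper publishes each message on both. Here the two topic resolution configurations are separated simply by giving each context a different resolver multicast address; the addresses, topic name, and payload are placeholders, not recommendations.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <lbm/lbm.h>

static void check(int rc, const char *what)
{
    if (rc == LBM_FAILURE) {
        fprintf(stderr, "%s: %s\n", what, lbm_errmsg());
        exit(1);
    }
}

/* Create a context whose topic resolution uses the given multicast
 * address, then create a source on "Prices" using the given transport. */
static lbm_src_t *make_src(lbm_context_t **ctxp, const char *resolver_addr,
                           const char *transport)
{
    lbm_context_attr_t *cattr;
    lbm_src_topic_attr_t *tattr;
    lbm_topic_t *topic;
    lbm_src_t *src;

    check(lbm_context_attr_create(&cattr), "context_attr_create");
    check(lbm_context_attr_str_setopt(cattr, "resolver_multicast_address",
                                      resolver_addr), "resolver address");
    check(lbm_context_create(ctxp, cattr, NULL, NULL), "context_create");
    lbm_context_attr_delete(cattr);

    check(lbm_src_topic_attr_create(&tattr), "topic_attr_create");
    check(lbm_src_topic_attr_str_setopt(tattr, "transport", transport), "transport");
    check(lbm_src_topic_alloc(&topic, *ctxp, "Prices", tattr), "topic_alloc");
    check(lbm_src_create(&src, *ctxp, topic, NULL, NULL, NULL), "src_create");
    lbm_src_topic_attr_delete(tattr);
    return src;
}

int main(void)
{
    lbm_context_t *ipc_ctx, *net_ctx;
    lbm_src_t *ipc_src, *net_src;
    const char *msg = "42.17";

    /* One topic resolution domain for local (IPC) receivers,
     * another for remote (LBT-RM) receivers.  Addresses are placeholders. */
    ipc_src = make_src(&ipc_ctx, "239.101.3.1", "lbtipc");
    net_src = make_src(&net_ctx, "239.101.3.2", "lbtrm");

    /* The simple wrapper: publish every message on both sources. */
    check(lbm_src_send(ipc_src, msg, strlen(msg), LBM_MSG_FLUSH), "ipc send");
    check(lbm_src_send(net_src, msg, strlen(msg), LBM_MSG_FLUSH), "rm send");

    lbm_src_delete(ipc_src);
    lbm_src_delete(net_src);
    lbm_context_delete(ipc_ctx);
    lbm_context_delete(net_ctx);
    return 0;
}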
The following diagram illustrates how sources and receivers interact with the shared memory area used in the LBT-IPC transport:
In the diagram above, three sources send (write) to two Shared Memory Areas while four receivers in two different contexts receive (read) from the areas. The assignment of sources to Shared Memory Areas demonstrates UM's round-robin method. UM assigns the source sending on Topic A to Transport 20001, the source sending on Topic B to Transport 20002, and the source sending on Topic C back to the start of the transport ID range, 20001.
The diagram also shows the UM configuration options that set up this scenario:
LBT-IPC requires no special operating system authorities, except on Microsoft Windows Vista and Microsoft Windows Server 2008, which require Administrator privileges. In addition, on Microsoft Windows XP, applications must be started by the same user; however, the user is not required to have administrator privileges. In order for applications to communicate with a service, the service must use a user account that has Administrator privileges.
LBT-IPC contexts and sources consume host resources as follows:
Across most operating system platforms, these resources have the following limits.
Consult your operating system documentation for specific limits per type of resource. Resources may be displayed and reclaimed using the LBT-IPC Resource Manager. See also the KB article Managing LBT-IPC Host Resources.
Deleting an IPC source or deleting an IPC receiver reclaims the shared memory area and locks allocated by the IPC source or receiver. However, if a process exits ungracefully, global resources remain allocated but unused. To address this possibility, the LBT-IPC Resource Manager maintains a resource allocation database with a record for each global resource (memory or semaphore) allocated or freed. You can use the LBT-IPC Resource Manager to discover and reclaim resources. See the three example outputs below.
Displaying Resources
Reclaiming Unused Resources
The LBT-SMX (shared memory acceleration) transport is an Interprocess Communication (IPC) transport you can use for the lowest latency message Streaming. LBT-SMX is faster than the LBT-IPC transport. Like LBT-IPC, sources can publish topic messages to a shared memory area from which receivers can read topic messages. Unlike LBT-IPC, the native APIs for the LBT-SMX transport are not thread safe and do not support all UM features such as message batching or fragmentation.
You can use either the native LBT-SMX API calls, lbm_src_buff_acquire() and lbm_src_buffs_complete(), to send over LBT-SMX, or you can use the lbm_src_send_*() API calls. The existing send APIs are thread safe with SMX, but they incur a synchronization overhead and thus are slower than the native LBT-SMX API calls.
LBT-SMX operates on the following Ultra Messaging 64-bit packages:
The example applications lbmlatping.c and lbmlatpong.c show how to use the C LBT-SMX API calls. For Java, see lbmlatping.java and lbmlatpong.java. For .NET, see lbmlatping.cs and lbmlatpong.cs.
Other example applications can use the LBT-SMX transport by using a UM configuration flat file containing 'source transport lbtsmx'. You cannot use LBT-SMX with example applications for features not supported by LBT-SMX, such as lbmreq, lbmresp, lbmrcvq, or lbmwrcvq.
The LBT-SMX configuration options are similar to the LBT-IPC transport options. See Transport LBT-SMX Operation Options for more information.
You can use Automatic Monitoring, UM API retrieve/reset calls, and LBMMON APIs to access LBT-SMX source and receiver transport statistics. To increase performance, the LBT-SMX transport does not collect statistics by default. Set the UM configuration option transport_lbtsmx_message_statistics_enabled (context) to 1 to enable the collection of transport statistics.
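A minimal sketch of turning that collection on programmatically; the helper assumes a context attribute object that will then be passed to lbm_context_create().

#include <lbm/lbm.h>

/* Turn on LBT-SMX transport statistics collection (off by default
 * for performance); apply before creating the context. */
int enable_smx_stats(lbm_context_attr_t *cattr)
{
    return lbm_context_attr_str_setopt(cattr,
        "transport_lbtsmx_message_statistics_enabled", "1");
}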
When you create a source with lbm_src_create() and you've set the source's transport configuration option to LBT-SMX, UM creates a shared memory area object. UM assigns one of the transport IDs to this area from a range of transport IDs specified with the UM context configuration options, transport_lbtsmx_id_high (context) and transport_lbtsmx_id_low (context). You can also specify a shared memory location inside or outside of this range with a source configuration option, transport_lbtsmx_id (source), to group certain topics in the same shared memory area, if needed. See Transport LBT-SMX Operation Options in the UM Configuration Guide.
UM names the shared memory area object according to the format, LBTSMX_x_d where x is the hexadecimal Session ID and d is the decimal Transport ID. Example names are LBTSMX_42792ac_20000 or LBTSMX_66e7c8f6_20001. Receivers access a shared memory area with this object name to receive (read) topic messages.
Sending on a topic with the native LBT-SMX APIs requires the two API calls lbm_src_buff_acquire() and lbm_src_buffs_complete(). A third convenience API, lbm_src_buffs_complete_and_acquire(), combines a call to lbm_src_buffs_complete() followed by a call to lbm_src_buff_acquire() into a single call, eliminating the overhead of an additional function call.
The native LBT-SMX APIs fail with an appropriate error message if a sending application uses them for a source configured to use a transport other than LBT-SMX.
Sending with LBT-SMX's native API is a two-step process.
The sending application first calls lbm_src_buff_acquire(), which returns a pointer into which the sending application writes the message data.
The pointer points directly into the shared memory region. UM guarantees that, when lbm_src_buff_acquire() returns, the shared memory area has at least as many contiguous bytes available for writing as specified with the len parameter. If your application set the LBM_SRC_NONBLOCK flag with lbm_src_buff_acquire(), UM returns an LBM_EWOULDBLOCK error condition if the shared memory region does not have enough contiguous space available.
Because LBT-SMX does not support fragmentation, your application must limit message lengths to a maximum equal to the value of the source's configured transport_lbtsmx_datagram_max_size (source) option minus 16 bytes for headers. In a system deployment that includes the DRO, this value should be the same as the datagram max sizes of other transport types. See Protocol Conversion.
After the application acquires the pointer into shared memory and writes the message data, it may call lbm_src_buff_acquire() again repeatedly to send a batch of messages to the shared memory area. If your application writes multiple messages in this manner, sufficient space must exist in the shared memory area; lbm_src_buff_acquire() returns an error if the available shared memory space is less than the size of the next message.
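A hedged sketch of the two-step native send in C, batching several small messages between calls to lbm_src_buffs_complete(). It assumes lbm_src_buff_acquire() returns the writable pointer through an output parameter, roughly as lbm_src_buff_acquire(src, &buf, len, flags); consult lbm.h for the exact signature.

#include <stdio.h>
#include <string.h>
#include <lbm/lbm.h>

/* Sketch: send a small batch of messages over LBT-SMX with the native
 * acquire/complete API.  NOTE: confirm the exact lbm_src_buff_acquire()
 * signature against lbm.h before use. */
static void send_batch(lbm_src_t *smx_src)
{
    int i;

    for (i = 0; i < 4; i++) {
        char payload[32];
        void *buf = NULL;
        size_t len;

        len = (size_t)snprintf(payload, sizeof(payload), "msg %d", i);

        /* Step 1: reserve len contiguous bytes in the shared memory
         * window.  Passing 0 for flags blocks (spin-waits) if needed;
         * LBM_SRC_NONBLOCK would return an error instead of blocking. */
        if (lbm_src_buff_acquire(smx_src, &buf, len, 0) == LBM_FAILURE) {
            fprintf(stderr, "acquire: %s\n", lbm_errmsg());
            return;
        }

        /* Write the message data directly into shared memory. */
        memcpy(buf, payload, len);
    }

    /* Step 2: make the whole batch visible to receivers at once. */
    if (lbm_src_buffs_complete(smx_src) == LBM_FAILURE)
        fprintf(stderr, "complete: %s\n", lbm_errmsg());
}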
LBT-SMX supports lbm_src_send_* API calls. These API calls are fully thread-safe. The LBT-SMX feature restrictions still apply, however, when using lbm_src_send_* API calls. The lbm_src_send_ex_info_t argument to the lbm_src_send_ex() and lbm_src_sendv_ex() APIs must be NULL when using an LBT-SMX source, because LBT-SMX does not support any of the features that the lbm_src_send_ex_info_t parameter can enable. See Differences Between LBT-SMX and Other UM Transports.
Since LBT-SMX does not support an implicit batcher or corresponding implicit batch timer, UM flushes all messages for all sends on LBT-SMX transports done with lbm_src_send_* APIs, which is similar to setting the LBM_MSG_FLUSH flag. LBT-SMX also supports the lbm_src_flush() API call, which behaves like a thread-safe version of lbm_src_buffs_complete().
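For comparison, the thread-safe path looks like an ordinary UM send. A minimal sketch follows; the payload argument is purely illustrative.

#include <string.h>
#include <lbm/lbm.h>

/* Thread-safe send on an LBT-SMX source.  No flush flag is needed:
 * SMX has no implicit batcher, so every lbm_src_send() is flushed.
 * lbm_src_flush() remains available as a thread-safe counterpart
 * to lbm_src_buffs_complete(). */
int send_threadsafe(lbm_src_t *smx_src, const char *payload)
{
    return lbm_src_send(smx_src, payload, strlen(payload), 0);
}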
The lbm_src_topic_alloc() API call generates log warnings if the given attributes specify an LBT-SMX transport and enable any of the features that LBT-SMX does not support. The lbm_src_topic_alloc() call succeeds, but UM does not enable the unsupported features indicated in the log warnings. Other API functions that operate on lbm_src_t objects, such as lbm_src_create(), lbm_src_delete(), or lbm_src_topic_dump(), operate with LBT-SMX sources normally.
Because LBT-SMX does not support fragmentation, your application must limit message lengths to a maximum equal to the value of the source's configured transport_lbtsmx_datagram_max_size (source) option minus 16 bytes for headers. Any send API calls with a length parameter greater than this configured value fail. In a system deployment that includes the DRO, this value should be the same as the datagram max sizes of other transport types. See Protocol Conversion.
Receivers operate over LBT-SMX identically to receivers on all other UM transports. The msg->data pointer of a delivered lbm_msg_t object points directly into the shared memory region.
The lbm_msg_retain() API function operates differently for LBT-SMX. lbm_msg_retain() creates a full copy of the message so that the data can be accessed outside the receiver callback.
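A sketch of a receiver callback that uses lbm_msg_retain() when the data must outlive the callback; with LBT-SMX, retain copies the message out of shared memory, so the retained message stays valid after the callback returns. The queueing helper enqueue_for_worker() is a hypothetical application function.

#include <stdio.h>
#include <lbm/lbm.h>

/* Hypothetical application function that hands a retained message to
 * another thread; that thread must call lbm_msg_delete() when done. */
extern void enqueue_for_worker(lbm_msg_t *msg);

static int on_receive(lbm_rcv_t *rcv, lbm_msg_t *msg, void *clientd)
{
    if (msg->type == LBM_MSG_DATA) {
        /* msg->data points directly into the SMX shared memory region,
         * so it is only valid inside this callback... */
        printf("got %lu bytes\n", (unsigned long)msg->len);

        /* ...unless we retain it.  For SMX, retain makes a full copy. */
        if (lbm_msg_retain(msg) != LBM_FAILURE)
            enqueue_for_worker(msg);
    }
    return 0;
}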
Topic Resolution and LBT-SMX
Topic resolution operates with LBT-SMX the same way as with other UM transports, albeit with the advertisement type, LBMSMX. Advertisements for LBT-SMX contain the Transport ID, Session ID, and Host ID. Receivers get LBT-SMX advertisements in the normal manner: from the resolver cache, from advertisements received on the multicast resolver address:port, or from responses to queries.
Although no actual network transport occurs, SMX functions in much the same way as if you were sending packets across the network, as with other UM transports.
You cannot use the following UM features with LBT-SMX:
You also cannot use LBT-SMX to send egress traffic from a UM daemon, such as the Persistent Store, UM Router, UM Cache, or UMDS.
The following diagram illustrates how sources and receivers interact with the shared memory area used in the LBT-SMX transport.
In the diagram above, three sources send (write) to two Shared Memory Areas while four receivers in two different contexts receive (read) from the areas. The assignment of sources to Shared Memory Areas demonstrates UM's round-robin method. UM assigns the source sending on Topic A to Transport 30001, the source sending on Topic B to Transport 30002, and the source sending on Topic C back to the start of the transport ID range, 30001.
The diagram also shows the UM configuration options that set up this scenario.
The option transport_lbtsmx_transmission_window_size (source) sets the size of each Shared Memory Area to 33554432 bytes (32 MB). This option's value must be a power of 2. If you configured the transmission window size to 25165824 bytes (24 MB), for example, UM logs a warning message and then rounds the value of this option up to the next power of 2, which is 33554432 bytes (32 MB).
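A minimal sketch of setting the window size shown in the diagram on a source topic attribute; any non-power-of-2 value would be rounded up with a log warning, as described above.

#include <lbm/lbm.h>

/* Give each LBT-SMX shared memory area a 32 MB transmission window.
 * The value must be a power of 2, or UM rounds it up and logs a warning. */
int set_smx_window(lbm_src_topic_attr_t *tattr)
{
    return lbm_src_topic_attr_str_setopt(tattr,
        "transport_lbtsmx_transmission_window_size", "33554432");
}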
The Java code examples for LBT-SMX send and receive one million messages. Start the receiver example application before you start the source example application.
Java Source Example
The source sends one million messages using the native LBT-SMX Java APIs. The sendMessages() method obtains a reference to the source's message buffer, which does not change for the life of the source. The call to acquireMessageBufferPosition(int, int) passes the requested message length of 8 bytes. When this call returns, it gives an integer position into the previously obtained message buffer, which is the position of the message data. UM guarantees that you can safely write the value of the counter i into the buffer at this position.
Java Receiver Example
The receiver reads messages from an LBT-SMX source using the new API on LBMMessage. The example extends the LBMReceiver class so that you can override the onReceive() method, which bypasses synchronization of multiple receiver callbacks. As a result, the addReceiver() and removeReceiver() methods do not work with this class, but they are not needed in this example. In the overridden onReceive() callback, we call getMessagesBuffer(), which returns a ByteBuffer view of the underlying transport. This allows the application to do zero-copy reads directly from the memory that stores the message data. The returned ByteBuffer's position and limit are set to the beginning and end of the message data. The message data does not start at position 0. The application reads a long out of the buffer, which is the same long that was placed by the source example.
Batching Example
You can implement a batching algorithm at the source by doing multiple acquires before calling complete. When receivers notice that new messages are available, they deliver all new messages in a single loop.
Blocking and Non-blocking Sends Example
By default, acquireMessageBufferPosition() waits for receivers to catch up before it writes the requested number of bytes to the buffer. The resulting spin wait block happens only if you did not set the flags argument to LBM.SRC_NONBLOCK. If the flags argument sets the LBM.SRC_NONBLOCK value, then the function returns -1 if the call would have blocked. For performance reasons, acquireMessageBufferPosition() does not throw new LBMEWouldBlock exceptions like standard send APIs.
Complete and Acquire Function
The function, messageBuffersCompleteAndAcquirePosition(), is a convenience function for the source and calls messageBuffersComplete() followed immediately by acquireMessageBufferPosition(), which reduces the number of method calls per message.
The .NET code examples for LBT-SMX send and receive one million messages. Start the receiver example application before you start the source example application.
.NET Source Example
You can access the shared memory region directly with the IntPtr structs. The src.buffAcquire() API modifies writePtr to point to the next available location in shared memory. When buffAcquire() returns, you can safely write to the writePtr location up to the length specified in buffAcquire(). The Marshal.WriteInt64() writes 8 bytes of data to the shared memory region. The call to buffsComplete() signals new data to connected receivers.
.NET Receiver Example
The application calls the simpleRcv::onReceive callback after the source places new data in the shared memory region. The msg.dataPointerSafe() API returns an IntPtr to the data, which does not create any new objects. The Marshal.ReadInt64 API then reads data directly from the shared memory.
Batching
You can implement a batching algorithm at the source by doing multiple acquires before calling complete. When receivers notice that new messages are available, they deliver all new messages in a single loop.
Blocking and Non-blocking Sends
By default, buffAcquire() waits for receivers to catch up before it writes the requested number of bytes to the buffer. The resulting spin wait block happens only if you did not set the flags argument to LBM.SRC_NONBLOCK. If the flags argument sets the LBM.SRC_NONBLOCK value, then the function returns -1 if the call would have blocked. For performance reasons, buffAcquire() does not throw new LBMEWouldBlock exceptions like standard send APIs.
Complete and Acquire Function
The function, buffsCompleteAndAcquire(), is a convenience function for the source and calls buffsComplete() followed immediately by buffAcquire(), which reduces the number of method calls per message.
Reduce Synchronization Overhead
Delivery latency to an LBMReceiver callback can be reduced by using a single receiver callback. Call LBMReceiverAttributes::enableSingleReceiverCallback on the attributes object used to create the LBMReceiver. The addReceiver() and removeReceiver() APIs become defunct, and the application receiver callback is invoked without any locks taken. The enableSingleReceiverCallback() API eliminates callback-related synchronization overhead.
Increase Performance with unsafe Code Constructs
Using .NET unsafe code constructs can increase performance. By manipulating pointers directly, you can eliminate calls to external APIs, resulting in lower latencies.
Deleting an SMX source or deleting an SMX receiver reclaims the shared memory area and locks allocated by the SMX source or receiver. However, if an ungraceful exit from a process occurs, global resources remain allocated but unused. To address this possibility, the LBT-SMX Resource Manager maintains a resource allocation database with a record for each global resource (memory or semaphore) allocated or freed. You can use the LBT-SMX Resource Manager to discover and reclaim resources. See the three example outputs below.
Displaying Resources
Reclaiming Unused Resources
With the UMQ product, you use the 'broker' transport to send messages from a source to a Queuing Broker, or from a Queuing Broker to a receiver.
When sources or receivers connect to a Queuing Broker, you must use the 'broker' transport. You cannot use the 'broker' transport with UMS or UMP products.