Release Notes
The most significant updates to UM version 6.13 are enhanced TCP-based Topic Resolution (redundancy and scalability), and Persistence performance improvements.
The following new features and enhancements apply to UMS, UMP, and UMQ products.
Redundant SRS. The Stateful Resolver Service (SRS) can now be deployed redundantly. UM components can be configured for multiple instances of SRS. If one SRS goes down or is cut off from the network, the other SRSes will continue to provide Topic Resolution service. See TCP-Based TR and Fault Tolerance for details.
SRS traffic reduction. The SRS now supports tracking of topics that each context is interested in (has receivers for). This allows for further reduction of TR traffic by filtering source advertisements based on topic interest. Note that this can interfere with the resolver_source_notification_function (context) and resolver_event_function features. See TCP-Based TR Interest for more information.
Configuration error handling. Configuration file errors can now be treated as "warnings" instead of "errors". See Configuration Error Handling for details.
The following new features and enhancements apply to UMP and UMQ products.
Performance improvements. Persistence throughput is increased for both storage of new messages and recovery of stored messages.
Improved recovery rates. Fixing bug 10877 allowed a change to the default value for Store option repository-disk-max-read-async-cbs from 16 to 10,000. This improves receiver recovery rates. See Special Upgrade Instructions for 6.13.
Thread affinity. The UM Store Daemon can now set thread affinity to "pin" threads to one or more desired CPU cores. This can provide a significant improvement in throughput. See Store Thread Affinity.
Marking messages in Store. Selected messages in the Persistent Store can now be marked as invalid, to prevent them from being delivered to a recovering receiver. This can be useful if a misbehaving publisher sends a "poison" message that causes receivers to crash; having that message in the Store's repository means that restarting the failed receiver will just cause it to crash again when the message is recovered. See Request: Mark Stored Message Invalid.
Forced de-registration. A failed receiver can now be de-registered from a Persistent Store. This will delete the state information for that receiver. See Request: Deregister Receiver.
The following new features and enhancements apply to the UMQ product.
The following new features and enhancements apply to the Dynamic Routing Option (DRO).
The following bug fixes apply to UMS, UMP, and UMQ products.
Change Request | Description |
---|---|
10726 | FIXED: When the multiple_receive_maximum_datagrams (context) option is used, undesired latencies can be introduced by multiple Topic Resolution or MIM datagrams being processed in a row. The option now only affects LBT-RM and LBT-RU transports. It no longer applies to Topic Resolution or MIM. |
10794 | FIXED: In a race condition related to the closing of a socket, it is possible to get: |
10685 | FIXED: In UM version 6.12, a dependency was added to UM for a dynamic library for "rsock", which is used for communication with the SRS. However, a static form of the rsock library was not provided. As of UM version 6.13, the "rsock" code is now provided in static form. It is included in the "liblbm.a" library. |
10841 | FIXED: A hot failover receiver can crash with: |
10822 | FIXED: The SRS Daemon Statistics "clients.DR.inactive.SIR.count" and "clients.DR.inactive.SIR.count" are incorrectly classified as errors and are included in the SRS_ERROR_STATS message type. The counters "clients.DR.inactive.SIR.count" and "clients.DR.inactive.SIR.count" have been moved to be in the SRS_STATS message type. See Message Type: SRS_STATS. |
10777 | FIXED: Inadvertent connection to TCP source from a UIM Request Client results in a memory leak. Note that while the memory leak is fixed, it is still not valid for an application to send a UIM to a port that is used by a TCP source. |
10657 | FIXED: There are cases where the use of a Hot Failover (HF) receiver will inhibit the delivery of unrecoverable Tail Loss events. |
9917 | FIXED: When using LBT-RU, packet lengths are sometimes improperly set according to LBT-RM's maximum datagram size. This can result in errors of the form: |
The following bug fixes apply to UMP and UMQ products.
Change Request | Description |
---|---|
10877 | FIXED: Large values for repository-disk-max-read-async-cbs can lead to unresponsive Stores during large message recovery operations by receivers. As of UM version 6.13, changes in the internal threading model of the Store avoid contention during receiver message recovery handshaking with the Store, allowing higher recovery performance without degrading responsiveness. Associated with this fix is a change in the default value for repository-disk-max-read-async-cbs from 16 to 10,000. Informatica recommends that users not explicitly set a value for repository-disk-max-read-async-cbs and instead allow it to default. |
10849 | FIXED: If using LBT-RU transport with Smart Sources, and two or more sources are mapped to the same transport session, deleting one of them leaves the other sources on the same session corrupted. This can result in segmentation faults or other bad behaviors. |
10818 | FIXED: If a persistent receiver configuration does not set values for ume_state_lifetime (receiver) and ume_activity_timeout (receiver) (both default to zero), the store's value of receiver-state-lifetime is not used to determine the receiver's state lifetime, as it should be. Instead, the value of receiver-activity-timeout is used. |
10867 | FIXED: If a Source initially registers with a Store with Proxy Sources disabled, and then the source restarts and registers with Proxy Sources enabled, the Store fails with a segmentation fault. This sequence no longer causes the Store to crash. However, changing the Proxy Source configuration is not permitted across a re-registration. See Registration Limitations. |
10811 | FIXED: There are certain sequences of shutting down and restarting persistent receivers and stores that can result in multiple entries for the same source registration on the Store Web Monitor. For this problem to happen, the receiver must use the deregistration API. |
10808 | FIXED: If a session ID value is chosen that exceeds 2**31, it is displayed on the Store Web Monitor as a negative number. Session IDs are now displayed as 32-bit unsigned numbers. |
10765 | FIXED: For all UM versions 6.11.* and 6.12.*, the Store logs the warning: This warning is benign and can be ignored. As of UM version 6.13, this warning is removed altogether. |
10737 | FIXED: A registering RPP receiver can get old messages replayed after a store is shut down, the state and cache files deleted, and the store restarted. This bug is caused by the cleanly restarted Store requesting Late Join message recovery from the Source, and the receiver registering with the Store during the late join process. The receiver is mistakenly told to start with the sequence number of the next message to be recovered by Late Join. As of UM version 6.13, when a receiver registers with a store and that store has no state information for that receiver, the receiver will be told to start with the sequence number that the source's live messages are at. |
10733 | FIXED: When a persistent receiver restarts after a period of being down, and the Store's retention policy has led to messages no longer being available, the receiver's persistent delivery controller does not properly use the value of delivery_control_maximum_burst_loss (receiver) when delivering loss events to the receiver callback. That is, even if the number of missing messages exceeds the configured value of delivery_control_maximum_burst_loss (receiver), UM still delivers a separate loss event for every missing sequence number. The persistence delivery controller will now deliver a single "burst loss" event if the number of missing messages exceeds delivery_control_maximum_burst_loss (receiver). |
10661 | FIXED: When sending messages using a Smart Source in Java, if the messages contain message properties and are larger than the datagram maximum size (requiring UM fragmentation), a crash can result. |
10805 | FIXED: When a UM configuration file is supplied to the UMP Element "<daemon-monitor>", source-scoped options are not applied. |
10483 | FIXED: There are rare circumstances under which a persistent receiver becomes deaf to some number of new messages sent by a source due to the receiver's expected sequence number being higher than the source's live sequence numbers. This typically happens when the source is restarted after a period of store overload, or through product misuse. The receiver registration now ensures that the low sequence number is never higher than the high, which allows all live messages sent by the source to be received. See Persistent Receiver and Duplicate Messages for more information. |
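The display problem described in bug 10808 above is a signed versus unsigned interpretation of the same 32 bits. The following is a minimal Python sketch of that effect, not code from UM or the Store Web Monitor; the helper names `as_signed32` and `as_unsigned32` and the sample session ID are made up for illustration.

```python
import struct


def as_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


def as_unsigned32(value: int) -> int:
    """Interpret the low 32 bits of value as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


# Hypothetical session ID just above 2**31.
session_id = 0x80000001

print(as_signed32(session_id))    # -2147483647 (the pre-6.13 style of display)
print(as_unsigned32(session_id))  # 2147483649 (the unsigned display)
```

Any value below 2**31 reads the same under both interpretations, which is why only large session IDs were affected.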
The following bug fixes apply to the UMQ product.
The following bug fixes apply to the Dynamic Routing Option (DRO).
Change Request | Description |
---|---|
10856 | FIXED: A DRO can crash with: |
10826 | FIXED: A DRO can crash with a segmentation fault when used with an SRS for TCP-based Topic Resolution. In a stack backtrace it happens in lbm_topic_resolve_process_tir(). |
10588 | FIXED: When a UM configuration file is supplied to the Router Element "<daemon-monitor>", source-scoped options are not applied. |
Users must specify an interface to "lbmrd".
Due to fixing bug 10937, some users of UM version 6.13 and beyond will need to slightly change their Unicast UDP Topic Resolution daemon "lbmrd" usage. If you do not specify an interface for lbmrd, it will appear that lbmrd doesn't work, although no error messages will appear in its log file.
Starting with UM version 6.13, an interface specification must be provided, either via the command-line using the "-i" option (see Lbmrd Man Page), or via the "<interface>" XML configuration element (see LBMRD Element "<interface>"). Note that an easy and flexible method for specifying the interface is using CIDR notation, such as "10.0.0.0/8". This matches any interface whose IP address starts with 10.
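To show what a CIDR specification such as "10.0.0.0/8" selects, here is a small sketch using Python's standard `ipaddress` module. It only demonstrates the matching rule; it is not how lbmrd itself is implemented. The function name `matching_interfaces` and the sample addresses are assumptions made for this example.

```python
import ipaddress


def matching_interfaces(cidr, interface_addrs):
    """Return the addresses from interface_addrs that fall inside cidr."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [addr for addr in interface_addrs
            if ipaddress.ip_address(addr) in network]


# Hypothetical addresses of the interfaces on a host.
host_addrs = ["127.0.0.1", "10.29.3.16", "192.168.1.5"]

print(matching_interfaces("10.0.0.0/8", host_addrs))  # ['10.29.3.16']
```

Because the match is on the network prefix, the same specification can be used unchanged on every host in the 10.x.x.x range, regardless of each host's exact address.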
Applications running UM version 6.13 and beyond have a potential incompatibility with pre-6.13 DROs. Due to Known Limitation 10833, Informatica recommends upgrading DROs before Persistent Stores. This only applies to users who upgrade gradually; for users who upgrade everything at once, this issue does not apply. See Order of Upgrades.
As a result of improving the overall performance of the Store, there is a condition where receiver message recovery performance can be significantly degraded if repository-disk-max-read-async-cbs is configured with a small value. In the past, this option needed to be small, due to bug 10877. However, as of UM version 6.13, this restriction has been lifted, and the default value changed to a more appropriate value.
If your Store configuration includes repository-disk-max-read-async-cbs, Informatica recommends removing the setting and allowing it to operate with its default value.
This is for users who have written monitoring applications that consume SRS Daemon Statistics. The counters "clients.DR.inactive.SIR.count" and "clients.DR.inactive.SIR.count" have been reclassified as non-errors, and moved to message type SRS_STATS. See bug 10822.
Due to fixing bug 10483, there is a certain use case in which there is an increased chance of a persistent receiver getting messages re-delivered that it had already processed (duplicate messages). It is important for persistent receivers to be able to detect duplicate messages after a registration; see Duplicate Message Delivery.
The use case affected by this bug fix is where a restarted sending application is able to re-send previously sent messages based on the starting sequence number. When a restarted source registers with the Stores, the Stores can tell the sender to start at an earlier sequence number than was previously sent. This can happen, for example, if the Stores experience tail loss right before the source restarts. It is unlikely, but possible.
In this use case, the sending application looks at the starting sequence number from the store and is able to reconstruct the previously-sent messages, starting at the given sequence number. This has the advantage that the Stores are correctly populated with the missing message data.
Starting with UM version 6.13, all receiving applications will have these messages delivered. This can be true even if a receiver has received those messages and acknowledged them to the Store. So in the use case where the sender is re-sending previously-sent messages, UM will deliver these messages again to the receiving application. However, note that if the application has successfully ACKed messages and UM is about to re-deliver them, UM will log the message:
Core-10483-1: WARNING: UMP receiver Low sequence number[0xx] is greater than the High sequence number [0xx] + 1. Setting to high + 1. Receiver will join the live stream and no messages will be recovered.
Note that the sequence numbers will match the previous transmissions. The application can keep track of the highest sequence number received for that source and skip any delivered messages equal to or less than that saved number. (For use cases where the sender always starts sending new messages after a restart, do not skip those messages.)
Also note that some applications use the LBM_MSG_FLAG_UME_RETRANSMIT flag bit in a message's lbm_msg_t_stct::flags field to indicate the message is recovered from the store and therefore subject to possible duplication. However, in the case of a restarted source, the messages are "live" and therefore don't have the LBM_MSG_FLAG_UME_RETRANSMIT flag set.
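The approach of tracking the highest sequence number per source can be sketched as follows. This is a simplified Python illustration, not UM API code. The class name `DuplicateFilter` and the source key are hypothetical. The sketch ignores sequence number wraparound and assumes the application keeps the saved numbers across its own restarts. It only suits the use case where a restarted sender re-sends previous messages under their original sequence numbers.

```python
class DuplicateFilter:
    """Skips messages whose sequence number was already processed for a source."""

    def __init__(self):
        # Maps a source identifier to the highest sequence number processed.
        self._highest = {}

    def should_process(self, source, sequence_number):
        """Return True for a new message, False for a duplicate."""
        highest = self._highest.get(source)
        if highest is not None and sequence_number <= highest:
            return False  # Equal to or less than the saved number: skip it.
        self._highest[source] = sequence_number
        return True


dedup = DuplicateFilter()
# The source sends 5, 6, 7, restarts, then re-sends 6 and 7 before sending 8.
results = [dedup.should_process("src-A", n) for n in (5, 6, 7, 6, 7, 8)]
print(results)  # [True, True, True, False, False, True]
```

Since the re-sent messages are "live", the filter keys on the sequence number alone and does not consult LBM_MSG_FLAG_UME_RETRANSMIT.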
Starting with UM version 6.13, UM Daemons will not be supported on platforms other than Linux and Windows.
If you are upgrading from a UM version prior to 6.13, you must also examine the Special Upgrade Instructions for 6.12.