Some Ultra Messaging® configuration options are related in ways that might not be immediately apparent. Changing the value for one option without adjusting its related option can cause problems such as NAK storms, tail loss, etc. This section identifies these relationships and recommends a best practice for setting the interrelated options.
The following sections discuss configuration option relationships.
The NAK generation interval should be sufficiently longer than the NAK backoff interval so that the source, after receiving the first NAK from a receiver, has time to retransmit the missing datagram before the remaining receivers send NAKs of their own, which prevents a NAK storm. LBT-RM, LBT-RU, and MIM all use NAK generation and backoff intervals, and the NAK behavior is the same for all of these transports.
Interrelated Options:
Recommendation:
Set the NAK generation interval to at least 2x the NAK backoff interval.
For more, see also Transport LBT-RM Reliability Options, Transport LBT-RU Reliability Options, or Multicast Immediate Messaging Reliability Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid NAK storms, set NAK generation interval to at least 2x the          |
# | NAK backoff interval.                                                        |
# +------------------------------------------------------------------------------+
#
receiver transport_lbtrm_nak_backoff_interval 200
receiver transport_lbtrm_nak_generation_interval 10000
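The 2x rule is easy to check mechanically before deploying a configuration. The following is a minimal Python sketch and is not part of Ultra Messaging: the function name `nak_intervals_ok` is hypothetical, and both arguments are assumed to be expressed in the same unit.

```python
def nak_intervals_ok(backoff_interval, generation_interval, factor=2):
    """Return True if the NAK generation interval is at least
    `factor` times the NAK backoff interval (same unit for both)."""
    return generation_interval >= factor * backoff_interval


# Values taken from the example configuration above.
if nak_intervals_ok(200, 10000):
    print("NAK intervals satisfy the 2x recommendation")
else:
    print("NAK generation interval is too short relative to the backoff interval")
```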
Tail loss describes a situation where the last few messages sent by a publisher before it exits are not received by a subscriber. A TSNI active threshold that is too small relative to the TSNI and/or NAK generation interval may cause tail loss, especially with ordered delivery.
Interrelated Options:
Recommendation:
Set the TSNI active threshold to at least the sum of 4x the topic sequence number info (TSNI) interval plus the NAK generation interval.
For more, see Transport LBT-RM Reliability Options or Transport LBT-RU Reliability Options.
Example:
#
# +-------------------------------------------------------------------------------+
# | To avoid tail loss, set transport_topic_sequence_number_info_active_threshold |
# | to at least the sum of 4x the topic sequence number interval plus the NAK     |
# | generation interval.                                                          |
# | NOTE: transport_topic_sequence_number_info_active_threshold is in seconds.    |
# +-------------------------------------------------------------------------------+
#
source transport_topic_sequence_number_info_interval 2000
receiver transport_lbtrm_nak_generation_interval 10000
source transport_topic_sequence_number_info_active_threshold 60
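Because the active threshold is given in seconds while the other two values in the example appear to be in milliseconds, the arithmetic is worth spelling out. This is a minimal Python sketch under that unit assumption; the function name is hypothetical and not part of any Ultra Messaging API.

```python
def min_tsni_active_threshold_seconds(tsni_interval_ms, nak_generation_interval_ms):
    """Smallest whole number of seconds that satisfies the recommendation:
    threshold >= 4 x TSNI interval + NAK generation interval.

    Assumes both inputs are in milliseconds and the threshold is in seconds.
    """
    total_ms = 4 * tsni_interval_ms + nak_generation_interval_ms
    # Ceiling division, so a fractional second rounds up rather than down.
    return -(-total_ms // 1000)


# Example values from above: 4 * 2000 ms + 10000 ms = 18000 ms = 18 s,
# so the configured threshold of 60 seconds leaves ample margin.
minimum = min_tsni_active_threshold_seconds(2000, 10000)
print(minimum, "seconds minimum; configured value of 60 is",
      "sufficient" if 60 >= minimum else "too small")
```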
With an LBT-IPC transport, an activity timeout that is too small relative to the session message interval may cause receiver deafness. If a timeout is too short, the keepalive messages might not be received in time to prevent the receiver from being deleted or disconnecting because the source appears to be gone.
Interrelated Options:
Recommendations:
Set the activity timeout to at least 2x the session message interval.
For more, see Transport LBT-IPC Operation Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid receiver deafness:                                                  |
# |   - set client activity timeout to at least 2x the acknowledgement interval. |
# |   - set activity timeout to at least 2x the session message interval.        |
# +------------------------------------------------------------------------------+
#
receiver transport_lbtipc_activity_timeout 60000
source transport_lbtipc_sm_interval 10000
An LBT-RM or LBT-RU receiver-side quiescent timeout may delete a transport session that a source is still active on. This can happen if the timeout is too short relative to the source's interval between session messages (which serve as a session keepalive).
Interrelated Options:
Recommendations:
Set the receiver LBT-RM or LBT-RU activity timeout to at least 3x the source session message maximum interval.
For more, see Transport LBT-RM Operation Options or Transport LBT-RU Operation Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid erroneous session timeouts, set receiver transport activity         |
# | timeout to at least 3x the source session message maximum interval.          |
# +------------------------------------------------------------------------------+
#
receiver transport_lbtrm_activity_timeout 60000
source transport_lbtrm_sm_maximum_interval 10000
receiver transport_lbtru_activity_timeout 60000
source transport_lbtru_sm_maximum_interval 10000
It is easy to accidentally reverse the low and high values for LBT-RM multicast addresses, which creates a very large address range instead of the intended one. Aside from excluding the intended addresses, this can cause error conditions.
Interrelated Options:
Recommendations:
Ensure that the intended low and high values for LBT-RM multicast addresses are not reversed.
For more, see Transport LBT-RM Network Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid incorrect LBT-RM multicast address ranges, ensure that you have not |
# | reversed the low and high values.                                            |
# +------------------------------------------------------------------------------+
#
context transport_lbtrm_multicast_address_low 224.10.10.10
context transport_lbtrm_multicast_address_high 224.10.10.14
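A reversed range can be caught by comparing the two addresses numerically rather than as strings. The sketch below uses Python's standard `ipaddress` module; the function name `multicast_range_ok` is hypothetical and the check is only an illustration, not something Ultra Messaging performs in this form.

```python
import ipaddress


def multicast_range_ok(low, high):
    """Return True if both values are IPv4 multicast addresses
    and the low address does not exceed the high address."""
    low_addr = ipaddress.IPv4Address(low)
    high_addr = ipaddress.IPv4Address(high)
    return low_addr.is_multicast and high_addr.is_multicast and low_addr <= high_addr


# Values from the example above, first as intended and then reversed.
for low, high in [("224.10.10.10", "224.10.10.14"),
                  ("224.10.10.14", "224.10.10.10")]:
    status = "ok" if multicast_range_ok(low, high) else "reversed or invalid"
    print(low, high, status)
```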
Note: These interrelations apply only to the Ultra Messaging Persistence or Ultra Messaging Queuing Edition.
A store or queue may be erroneously declared unresponsive if its activity timeout expires before enough activity check intervals have elapsed to verify that it is still active.
Interrelated Options:
ume_store_activity_timeout
ume_store_check_interval
umq_queue_activity_timeout
umq_queue_check_interval
Recommendations:
Set the store or queue activity timeout to at least 5x the activity check interval.
For more, see the UM Configuration Guide, 4.29. Ultra Messaging Persistence Options and/or (if using UM Queuing Edition), the UM Configuration Guide, 4.30. Ultra Messaging Queuing Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid erroneous store or queue activity timeouts, set the activity        |
# | timeout to at least 5x the activity check interval.                          |
# +------------------------------------------------------------------------------+
#
source ume_store_activity_timeout 3000
source ume_store_check_interval 500
context umq_queue_activity_timeout 3000
context umq_queue_check_interval 500
Note: These interrelations apply only to the Ultra Messaging Queuing Edition.
A ULB source or receiver may be erroneously declared unresponsive if its activity timeout expires before enough activity check intervals have elapsed for it to attempt to re-register when the source appears to be inactive. It is also possible for sources to attempt to reassign messages that have already been processed.
Interrelated Options:
umq_ulb_source_activity_timeout
umq_ulb_source_check_interval
umq_ulb_application_set_message_reassignment_timeout
umq_ulb_application_set_receiver_activity_timeout
umq_ulb_check_interval
Recommendations:
Set the ULB source activity timeout to at least 5x the ULB source activity check interval.
Set the ULB application set message reassignment timeout to at least 5x the ULB check interval.
Set the ULB receiver activity timeout to at least 5x the ULB check interval.
For more (if using UM Queuing Edition), see the UM Configuration Guide, 4.30. Ultra Messaging Queuing Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid erroneous ULB source, receiver or application set message activity  |
# | timeouts, set the activity timeout to at least 5x the activity check         |
# | interval.                                                                    |
# +------------------------------------------------------------------------------+
#
receiver umq_ulb_source_activity_timeout 10000
receiver umq_ulb_source_check_interval 1000
source umq_ulb_application_set_message_reassignment_timeout 50000
source umq_ulb_application_set_receiver_activity_timeout 10000
source umq_ulb_check_interval 1000
A unicast resolver daemon may be erroneously declared inactive if its activity timeout expires before it has had adequate opportunity to verify that it is still alive.
Interrelated Options:
Recommendations:
Set the unicast resolver daemon activity timeout to at least 5x the activity check interval. Or, if activity notification is not desired, set both options to 0.
For more, see Resolver Operation Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid erroneous unicast resolver daemon timeouts, set the activity        |
# | timeout to at least 5x the activity check interval.                          |
# +------------------------------------------------------------------------------+
#
context resolver_unicast_activity_timeout 1000
context resolver_unicast_check_interval 200
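The store, queue, ULB, and unicast resolver recommendations all share the same 5x shape, so they can be checked together. The following Python sketch assumes the flat `scope option value` layout used in the examples in this section; the helper names `parse_flat_config` and `five_x_violations` are hypothetical, and the parser ignores scope, so it is only suitable for files that set each option once.

```python
# (timeout option, check-interval option) pairs named in the sections above.
FIVE_X_RULES = [
    ("ume_store_activity_timeout", "ume_store_check_interval"),
    ("umq_queue_activity_timeout", "umq_queue_check_interval"),
    ("umq_ulb_source_activity_timeout", "umq_ulb_source_check_interval"),
    ("umq_ulb_application_set_message_reassignment_timeout", "umq_ulb_check_interval"),
    ("umq_ulb_application_set_receiver_activity_timeout", "umq_ulb_check_interval"),
    ("resolver_unicast_activity_timeout", "resolver_unicast_check_interval"),
]


def parse_flat_config(text):
    """Parse 'scope option value' lines into {option: value}.
    Blank lines and lines starting with '#' are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 2)
        if len(parts) == 3:
            _scope, name, value = parts  # scope is ignored in this sketch
            options[name] = value.strip()
    return options


def five_x_violations(options, factor=5):
    """Return the (timeout, check) option pairs that break the 5x rule.
    Pairs with either option missing are not checked."""
    violations = []
    for timeout_name, check_name in FIVE_X_RULES:
        if timeout_name not in options or check_name not in options:
            continue
        timeout = int(options[timeout_name])
        check = int(options[check_name])
        # The unicast resolver section allows both options to be 0
        # when activity notification is not desired.
        if timeout_name.startswith("resolver_unicast") and timeout == 0 and check == 0:
            continue
        if timeout < factor * check:
            violations.append((timeout_name, check_name))
    return violations


sample = """
# the resolver pair satisfies the rule, the store pair does not
context resolver_unicast_activity_timeout 1000
context resolver_unicast_check_interval 200
source ume_store_activity_timeout 1000
source ume_store_check_interval 500
"""
for timeout_name, check_name in five_x_violations(parse_flat_config(sample)):
    print(timeout_name, "is less than 5x", check_name)
```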
If a transport times out during a Late Join operation while a receiver is requesting retransmission of missing messages, lost messages can go undetected and are likely to become unrecoverable.
Interrelated Options:
Recommendations:
Set the Late Join retransmit request interval to a value less than its transport's activity timeout value.
For more, see Late Join Options and also the applicable Transport LBT-RU Operation Options section.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid a transport inactivity timeout while requesting Late Join           |
# | retransmissions, set the Late Join retransmit request interval to a value    |
# | less than its transport's activity timeout.                                  |
# +------------------------------------------------------------------------------+
#
receiver retransmit_request_generation_interval 10000
receiver transport_lbtrm_activity_timeout 60000
It is possible that an unrecoverable loss due to unsatisfied NAKs or a transport activity timeout may go unreported if the delivery controller loss check is disabled or has too long an interval. For UMP stores, the loss check interval must be enabled. Two options (three, if using LBT-RM) are interrelated and must be set according to the guidelines below.
Interrelated Options:
Recommendations:
For LBT-RM, set the transport activity timeout to a value greater than the sum of the delivery control loss check interval and the NAK generation interval. Also, set the NAK generation interval to at least 4x the delivery control loss check interval.
For LBT-RU, set the transport activity timeout to a value greater than the delivery control loss check interval.
For UMP, always enable the delivery control loss check interval when configuring a store, and set it according to the guidelines above.
For more, see Delivery Control Options.
Example:
#
# +------------------------------------------------------------------------------+
# | To avoid undetected or unreported loss, set NAK generation to 4x the delivery|
# | control check interval, and ensure that these two combined are less than the |
# | transport activity timeout                                                   |
# +------------------------------------------------------------------------------+
#
receiver delivery_control_loss_check_interval 2500
receiver transport_lbtrm_activity_timeout 60000
receiver transport_lbtrm_nak_generation_interval 10000
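The LBT-RM guideline combines three values, so it helps to check every condition at once. This is a minimal Python sketch; the function name is hypothetical, all three arguments are assumed to be in the same unit, and treating a loss check interval of 0 as "disabled" is an assumption rather than something stated in this section.

```python
def lbtrm_loss_detection_problems(loss_check, nak_generation, activity_timeout):
    """Return a list of human-readable problems; an empty list means
    the three LBT-RM values satisfy the recommendations above."""
    problems = []
    if loss_check <= 0:
        # Assumption: a value of 0 means the loss check is disabled.
        problems.append("delivery control loss check is disabled")
    if activity_timeout <= loss_check + nak_generation:
        problems.append(
            "activity timeout is not greater than loss check + NAK generation")
    if nak_generation < 4 * loss_check:
        problems.append(
            "NAK generation interval is less than 4x the loss check interval")
    return problems


# Values from the example above: 60000 > 2500 + 10000 and 10000 >= 4 * 2500.
issues = lbtrm_loss_detection_problems(2500, 10000, 60000)
print("no problems found" if not issues else issues)
```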
Copyright (c) 2004 - 2014 Informatica Corporation. All rights reserved.