4.20. Late Join Options

4.20.1. Late Join Recovery

4.20.1.1. Overview

Late Join allows sources to save a predefined amount of their messaging traffic for late-joining receivers. Sources set the configuration options that determine whether they use Late Join, and receivers set options that determine whether they participate in Late Join recovery when sources use Late Join.

UMP's persistent store is built on Late Join technology. In the Estimating Recovery Time discussion below, the terms "Late Join buffers" and "UMP store" are roughly equivalent.

For more, review the Late Join section in the Concepts Guide, especially Configuring Late Join for Large Numbers of Messages.

4.20.1.2. Estimating Recovery Time

To estimate Late Join recovery time R in minutes, use the formula:

    R = D / ( 1 - ( txrate / rxrate ) )

where:

  • D is the downtime (in minutes) across all receivers

  • txrate is the average transmission rate of normal messages from sources during recovery (in kmsgs/sec)

  • rxrate is the average recovery rate from source-side Late Join buffers during recovery (in kmsgs/sec)

For example, consider the following scenario:

  • D = 10 minutes

  • txrate = 10k messages / second

  • rxrate = 25k messages / second

Plugging these values into the formula gives R = 10 / ( 1 - ( 10 / 25 ) ), or about 16.67 minutes. You can use this estimated recovery time to set the Late Join option retransmit_request_generation_interval (receiver). Set it at least as high as the longest expected recovery time, converted to milliseconds (16.67 minutes is roughly 1,000,000 milliseconds). If this interval is too short, you may experience burst loss during recovery.
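
The calculation above can be sketched in a few lines of Python. This is an illustrative helper only, not part of any UM API; the function names are hypothetical. It also rejects the case where txrate is not lower than rxrate, since the denominator then becomes zero or negative and the formula yields no meaningful recovery time.

```python
import math


def estimate_recovery_minutes(downtime_min, txrate, rxrate):
    """Return R = D / (1 - txrate / rxrate), in minutes.

    txrate and rxrate must use the same unit (for example, kmsgs/sec).
    """
    if downtime_min < 0 or txrate < 0 or rxrate <= 0:
        raise ValueError("downtime and txrate must be >= 0, rxrate must be > 0")
    if txrate >= rxrate:
        # Denominator is zero or negative: recovery would never catch up.
        raise ValueError("rxrate must exceed txrate for recovery to finish")
    return downtime_min / (1.0 - txrate / rxrate)


def minutes_to_interval_ms(minutes):
    """Convert minutes to whole milliseconds, rounding up."""
    return math.ceil(minutes * 60_000)


# Scenario from the text: D = 10 minutes, txrate = 10k, rxrate = 25k.
r = estimate_recovery_minutes(10, 10, 25)
print(f"R = {r:.2f} minutes")
print(f"suggested minimum interval: {minutes_to_interval_ms(r)} ms")
```

Rounding up rather than down keeps the suggested interval at least as long as the estimated recovery time.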

Note that this formula assumes the following:

  • Recovery rate is as linear as possible, which assumes option response_tcp_nodelay is set to 1

  • Transmit rate (txrate) from *all* relevant sources is fairly constant and equal

  • Recovery rate (rxrate) from Late Join buffers is fairly constant and equal, and should be measured in a live test, if possible. You can adjust the recovery rate with two Late Join configuration options:



4.20.2. late_join (source)

Configure the source to enable both Late Join and Off-Transport Recovery (OTR) operation for receivers.

Scope: source
Type: int
When to Set: Can only be set during object initialization.
Value  Description
1      Enable source for Late Join and OTR.
0      Disable source for Late Join and OTR. Default for all.

4.20.3. retransmit_initial_sequence_number_request (receiver)

When a late-joining receiver detects (from the topic advertisement) that a source is enabled for Late Join but has sent no messages, this flag option lets the receiver request an initial sequence number from the source. Sources respond with a TSNI (Topic Sequence Number Information) message.

Scope: receiver
Type: int
Default value: 1
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 4.2.
Value  Description
1      The receiver requests an initial sequence number from Late Join enabled sources that have not sent any messages. Default for all.
0      The receiver does not request an initial sequence number.

4.20.4. retransmit_message_caching_proximity (receiver)

This option controls when a receiver begins caching new messages during a recovery. The option value determines how close (proximate) the current new sequence number must be to the latest retransmitted sequence number before the receiver starts caching. A smaller value means that caching does not begin until recovery has caught up somewhat with the source; a larger value means that caching can begin earlier during recovery. A value greater than or equal to the default turns on caching of new data immediately. The receiver recovers any uncached data later in the recovery process through the retransmit request mechanism. This value is meaningful only for receivers using ordered delivery of data. See Configuring Late Join for Large Numbers of Messages for additional information about this option.

Scope: receiver
Type: lbm_ulong_t
Units: messages
Default value: 2147483647
When to Set: Can only be set during object initialization.
Version: This option was implemented in LBM 3.3.2/UME 2.0.
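
The proximity test described above can be pictured with the following minimal sketch. It is not UM's internal logic: the function name is hypothetical, and the use of "less than or equal" and the absence of sequence-number wraparound handling are assumptions made for illustration.

```python
DEFAULT_PROXIMITY = 2147483647  # default value of the option


def should_cache_new_message(new_sqn, latest_retransmitted_sqn, proximity):
    """Return True when a new message is close enough to the recovery point to cache.

    Assumption: the gap is a plain difference of sequence numbers and the
    comparison is inclusive.
    """
    gap = new_sqn - latest_retransmitted_sqn
    return gap <= proximity


# With the default, practically any gap qualifies, so caching starts at once.
print(should_cache_new_message(1_000_000, 10, DEFAULT_PROXIMITY))

# With a small value, caching waits until recovery is within 1000 messages
# of the newest data.
print(should_cache_new_message(1_000_000, 10, 1000))
print(should_cache_new_message(1_000_000, 999_500, 1000))
```

New messages that arrive while the check fails are not cached; as the text notes, the receiver obtains them later through retransmit requests.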

4.20.5. retransmit_request_generation_interval (receiver)

The maximum interval between when a receiver first sends a retransmission request and when the receiver gives up and reports loss on any retransmissions (RXs) it has not yet received. See Configuring Late Join for Large Numbers of Messages for additional information about this option.

Scope: receiver
Type: lbm_ulong_t
Units: milliseconds
Default value: 10000 (10 seconds)
When to Set: Can only be set during object initialization.

4.20.6. retransmit_request_interval (receiver)

The interval between retransmission request messages to the source. See Configuring Late Join for Large Numbers of Messages for additional information about this option.

Scope: receiver
Type: lbm_ulong_t
Units: milliseconds
Default value: 500 (0.5 seconds)
When to Set: Can only be set during object initialization.

4.20.7. retransmit_request_maximum (receiver)

The maximum number of messages to request, counting backward from the current latest message, when late-joining a topic. A value of 0 indicates no maximum. Due to network timing factors, UM may transmit one additional message. For example, a value of 5 results in 5 or possibly 6 retransmitted messages being sent to the new receiver. Hence, you cannot request and be guaranteed to receive only the single last message; you may get 2.

Scope: receiver
Type: lbm_ulong_t
Units: messages
Default value: 0
When to Set: Can only be set during object initialization.
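
As a rough illustration of "counting backward from the current latest message", the sketch below computes the lowest sequence number a late-joining receiver would ask for. The function and its earliest_retained_sqn parameter are hypothetical and are not UM API names; the sketch also ignores the possible extra message mentioned above and any sequence-number wraparound.

```python
def first_requested_sqn(latest_sqn, earliest_retained_sqn, request_maximum):
    """Return the lowest sequence number to request when late-joining.

    request_maximum == 0 means no maximum, so the request reaches back to
    the earliest message the source still retains (an assumed input).
    """
    if request_maximum == 0:
        return earliest_retained_sqn
    # Count backward from the latest message, but never past what is retained.
    return max(earliest_retained_sqn, latest_sqn - request_maximum + 1)


# Latest message is 100 and the source retains messages from 0 onward.
print(first_requested_sqn(100, 0, 5))  # request covers 96 through 100
print(first_requested_sqn(100, 0, 0))  # no maximum: request starts at 0
```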

4.20.8. retransmit_request_outstanding_maximum (receiver)

The maximum number of messages to request at a single time from a persistent store or a source. A value of 0 indicates no maximum. See Configuring Late Join for Large Numbers of Messages for additional information about this option.

Scope: receiver
Type: lbm_ulong_t
Units: messages
Default value: 200
When to Set: Can only be set during object initialization.

4.20.9. retransmit_retention_age_threshold (source)

Specifies the minimum age of messages in the retained message buffer before UM can delete them. UM cannot delete any messages younger than this value. For UMS Late Joins, this and retransmit_retention_size_threshold are the only options that affect the retention buffer size. For UMP, these two options combined with retransmit_retention_size_limit affect the retention buffer size. UM deletes a message when it meets all configured threshold criteria, i.e., the message is older than this option (if set), and the size of the retention buffer exceeds the retransmit_retention_size_threshold (if set). A value of 0 sets the age threshold to be always triggered, in which case deletion is determined by other threshold criteria.

Scope: source
Type: lbm_ulong_t
Units: seconds
Default value: 0 (threshold always triggered)
When to Set: Can only be set during object initialization.
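
The deletion rule described above (a message must satisfy every configured threshold) can be expressed as a small predicate. This is a sketch under stated assumptions, not UM's implementation: the function name is hypothetical, strict "greater than" comparisons are assumed, and the UMP-specific effects of retransmit_retention_size_limit and of stability or delivery confirmation are not modeled.

```python
def may_delete(message_age_s, buffer_bytes,
               age_threshold_s=0, size_threshold_bytes=0):
    """Return True when a retained message meets all threshold criteria.

    A threshold of 0 is treated as always triggered, matching the defaults.
    """
    age_met = age_threshold_s == 0 or message_age_s > age_threshold_s
    size_met = size_threshold_bytes == 0 or buffer_bytes > size_threshold_bytes
    return age_met and size_met


# Age threshold 60 s, size threshold 1 MB.
print(may_delete(90, 2_000_000, 60, 1_000_000))  # old enough, buffer large enough
print(may_delete(90, 500_000, 60, 1_000_000))    # buffer still below threshold
print(may_delete(30, 2_000_000, 60, 1_000_000))  # message still too young
```

Setting only one of the two thresholds leaves the other at 0, so the configured one alone decides when messages become eligible for deletion.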

4.20.10. retransmit_retention_size_limit (source)

Sets a maximum limit on the size of the source's retransmit retention buffer when using a UMP store. With UMP, stability and delivery confirmation events can delay the deletion of retained messages, which can increase the size of the buffer above the retransmit_retention_size_threshold. Hence, this option provides a hard size limit. UM sets a minimum value for this option of 8K for UDP and 64K for TCP, and issues a log warning if you set a value less than the minimum.

Scope: source
Type: size_t
Units: bytes
Default value: 25165824 (24 MB)
When to Set: Can only be set during object initialization.

4.20.11. retransmit_retention_size_threshold (source)

Specifies the minimum size of the retained message buffer before UM can delete messages. The buffer must reach this size before UM can delete any messages older than retransmit_retention_age_threshold. For UMP, these options combined with retransmit_retention_size_limit affect the retention buffer size. A value of 0 sets the size threshold to be always triggered, in which case deletion is determined by other threshold criteria.

Scope: source
Type: size_t
Units: bytes
Default value: 0 (threshold always triggered)
When to Set: Can only be set during object initialization.

4.20.12. use_late_join (receiver)

Flag indicating whether the receiver participates in Late Join operation.

Scope: receiver
Type: int
When to Set: Can only be set during object initialization.
Value  Description
1      The receiver participates in Late Join if the source requests it. Default for all.
0      The receiver does not participate in Late Join, even if the source requests it.
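
To tie the source and receiver options together, the sketch below renders a small set of Late Join settings as configuration lines. It assumes the plain-text "scope option value" line layout for UM configuration files, which this section does not itself describe; the helper name is hypothetical, and the interval value is only an example sized for a recovery of roughly 16.67 minutes.

```python
def render_flat_config(options):
    """Render (scope, option, value) tuples as 'scope option value' lines.

    The line layout is an assumption about the UM flat configuration file.
    """
    return "\n".join(f"{scope} {name} {value}" for scope, name, value in options)


late_join_options = [
    ("source", "late_join", 1),          # source retains messages for Late Join
    ("receiver", "use_late_join", 1),    # receiver participates (the default)
    # Example only: about 16.67 minutes expressed in milliseconds.
    ("receiver", "retransmit_request_generation_interval", 1000000),
]

print(render_flat_config(late_join_options))
```

Because all of these options can only be set during object initialization, they must be in place before the corresponding sources and receivers are created.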

Copyright (c) 2004 - 2014 Informatica Corporation. All rights reserved.