Guide for Persistence
In discussing Persistence, we refer specifically to recovery from the failure of sources, receivers, and Persistent Stores.
The user can choose between two different persistence modes:
UM uses a daemon program known as the Store to persist source (publisher) and receiver (subscriber) state. A Store instance can persist state in memory as well as on disk. State is persisted on a per-topic, per-source basis by the Store. Alongside each publisher's state, the Store keeps a message cache containing the full contents of messages recently sent by that source.
The purpose of the Store is to allow receivers to recover messages that they were not able to get directly from the source.
The Store is an independent component, not part of the source. If a persistent publisher fails, that source's messages are maintained by the Store according to configurable retention policies.
Note that the design of UM's persistence allows a maximum of 2,147,483,647 messages (2**31 - 1) to be persisted.
Stores can be configured to be disk-based or memory-only. A disk-based Store uses memory as temporary storage while messages are written to disk. Memory-only Stores only hold messages in memory. The memory-only Stores have higher throughput, while disk-based Stores have greater message capacity.
Note that most UM deployments only use disk-based Stores. Most of this document is written with that assumption.
It is important to remember the different kinds of configuration.
So Stores need two kinds of configuration files: Store configuration files and LBM configuration files. Applications only need LBM configuration files.
UM persistence identifies sources and receivers with Registration Identifiers, also called Registration IDs or RegIDs. A RegID is a 32-bit number that uniquely identifies a source or a receiver to a Store instance. This means that RegIDs are also specific to a Store instance and can be reused between individual Store instances, if needed. No two active sources or receivers can use the same RegID at the same time. This point is critical: since UM lets your application assign and handle RegIDs very freely, you must use RegIDs carefully to avoid destructive results.
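The uniqueness rule can be illustrated with a minimal sketch. The `RegIdRegistry` class below is hypothetical and is not part of the UM API; it only models one Store instance's view of active RegIDs, assuming an unsigned 32-bit range. How a real Store reacts to a collision is not modeled here.

```python
class RegIdRegistry:
    """Hypothetical model of the RegIDs active at one Store instance."""

    MAX_REGID = 2**32 - 1  # assumption: RegIDs are unsigned 32-bit numbers

    def __init__(self):
        self._active = {}  # RegID -> name of the source or receiver using it

    def register(self, regid, client):
        """Mark a RegID as in use; refuse if another client already holds it."""
        if not 0 <= regid <= self.MAX_REGID:
            raise ValueError("RegID must fit in 32 bits")
        if regid in self._active:
            raise RuntimeError(
                f"RegID {regid} is already in use by {self._active[regid]}"
            )
        self._active[regid] = client

    def release(self, regid):
        """Mark a RegID as no longer active (for example, the client exited)."""
        self._active.pop(regid, None)

    def is_active(self, regid):
        return regid in self._active


# Each Store instance has its own RegID space, so reuse across Stores is fine.
store_a = RegIdRegistry()
store_b = RegIdRegistry()
store_a.register(1001, "source-1")
store_b.register(1001, "source-2")
```

In this model, a second `store_a.register(1001, ...)` raises until the first holder is released, which mirrors the rule that a RegID has at most one active user per Store instance.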
While RegIDs can be managed directly by applications, Informatica recommends the use of Session IDs instead. See Managing RegIDs with Session IDs.
A persistent receiver provides confirmation (acknowledgement) to the Store instance as it consumes (processes) messages. This is fundamental to the design of UM persistence.
The receiver can optionally provide this confirmation (acknowledgement) to the persistent source as well. These confirmations are turned off by default, but can be requested through either or both of two LBM configuration options:
These two options are unrelated to each other, except that they both request the receiver to send delivery confirmations. Note that when either or both of the options are set, the persistent source requests that the persistent receiver supply delivery confirmations. The persistent receiver has the option to decline the request by setting the option ume_allow_confirmed_delivery (receiver) to 0.
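The request-and-decline logic described above can be summarized in a small sketch. The function below is hypothetical and only models the decision; it is not a UM API call. It assumes that ume_allow_confirmed_delivery (receiver) is non-zero unless the receiver explicitly sets it to 0.

```python
from typing import Iterable


def receiver_sends_confirmations(
    source_options_set: Iterable[bool],
    allow_confirmed_delivery: int = 1,  # assumed default: receiver does not decline
) -> bool:
    """Return True if the receiver sends delivery confirmations to the source.

    source_options_set holds one boolean per source-side option that requests
    confirmations; setting either (or both) makes the source issue the request.
    """
    requested = any(source_options_set)
    return requested and allow_confirmed_delivery != 0


# Neither source option set: no request, so no confirmations to the source.
print(receiver_sends_confirmations([False, False]))
# One option set and the receiver does not decline.
print(receiver_sends_confirmations([False, True]))
# Both options set, but the receiver declines with a value of 0.
print(receiver_sends_confirmations([True, True], allow_confirmed_delivery=0))
```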
The latter LBM option, ume_retention_unique_confirmations (source), can provide a form of receiver-pacing: the source is not allowed to get ahead of the receiving applications by more than the Persistence Flight Size. For more information, see Confirmed Delivery.
Sources and Persistent Stores retain messages in memory according to a set of rules collectively called the retention policy. The rules specify when UM will remove a message from memory, an action called "reclaiming" (because the memory is reclaimed from the buffer). Note that reclaiming a message from memory does not mean the message can no longer be recovered. The opposite is true: a message is reclaimed from memory only after it is stable on the Stores.
A message must satisfy every rule before it can be reclaimed. Conversely, any message not complying with all rules will not be reclaimed. A source or Store instance retains messages in memory until its retention policy dictates the message may be removed. Sources and Stores use slightly different retention policies based on their individual roles.
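The "every rule must be satisfied" behavior amounts to a logical AND over the rules. The sketch below is illustrative only: the `RetainedMessage` type and the two example rules are hypothetical stand-ins, not the actual UM retention rules, which are described in Source Message Retention and Release.

```python
from dataclasses import dataclass


@dataclass
class RetainedMessage:
    sequence_number: int
    stable: bool        # acknowledged as stable by the Stores
    age_seconds: float  # time since the message was sent


def is_stable(msg):
    """Example rule: the message must already be stable."""
    return msg.stable


def is_at_least(min_age_seconds):
    """Example rule factory (hypothetical): the message must be old enough."""
    return lambda msg: msg.age_seconds >= min_age_seconds


def can_reclaim(msg, rules):
    """A message may be reclaimed only if it satisfies every rule."""
    return all(rule(msg) for rule in rules)


rules = [is_stable, is_at_least(5.0)]
old_and_stable = RetainedMessage(1, stable=True, age_seconds=10.0)
old_not_stable = RetainedMessage(2, stable=False, age_seconds=10.0)
print(can_reclaim(old_and_stable, rules))
print(can_reclaim(old_not_stable, rules))
```

Failing any single rule is enough to keep the message in memory, which is why the second message above is retained even though it is old enough.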
For more information, see Source Message Retention and Release.
Sources send messages to both receivers and to Store instances. Messages become stable once the message has been persisted at the Store or a set of Stores, and those Stores acknowledge stability to the sources. Since it takes time to write messages to disk and signal stability, the source is allowed to continue sending messages while waiting for stability acknowledgements. Any messages sent but not yet acknowledged are said to be "in flight". The number of in-flight messages is normally limited. For more information, see Persistence Flight Size.
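As a rough illustration of the in-flight accounting, the hypothetical `FlightTracker` below counts messages that have been sent but not yet acknowledged as stable. It is a sketch, not the UM implementation; in particular, what the library actually does when the limit is reached is covered in Persistence Flight Size, whereas this model simply refuses the send.

```python
class FlightTracker:
    """Hypothetical model of a source's in-flight message accounting."""

    def __init__(self, flight_size):
        self.flight_size = flight_size  # maximum number of in-flight messages
        self._in_flight = set()         # sequence numbers awaiting stability

    @property
    def in_flight(self):
        return len(self._in_flight)

    def can_send(self):
        return self.in_flight < self.flight_size

    def on_send(self, sequence_number):
        """Record a sent message; it stays in flight until it becomes stable."""
        if not self.can_send():
            raise RuntimeError("flight size reached; wait for stability")
        self._in_flight.add(sequence_number)

    def on_stable(self, sequence_number):
        """Record a stability acknowledgement for a message."""
        self._in_flight.discard(sequence_number)


tracker = FlightTracker(flight_size=2)
tracker.on_send(1)
tracker.on_send(2)
print(tracker.can_send())  # limit reached, no further sends in this model
tracker.on_stable(1)
print(tracker.can_send())  # one slot freed by the stability acknowledgement
```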
In addition, UM informs the application when messages are stabilized. Until that stability acknowledgement is received, the source cannot assume the messages will be successfully delivered. The message stability acknowledgement is vital to ensuring that messages will not be lost. For more information, see Source Message Retention and Release.
Typically, multiple Store instances are deployed as a group for redundant operation. In this configuration, one or more Stores (or the hosts they run on) can fail without impacting the message flow from sources to receivers, as long as a quorum of the configured Stores is operational. UM defines a quorum as a majority of the configured Stores. For example, if 3 Store instances are configured, messaging can continue as long as at least 2 are operational. If 5 Store instances are configured, messaging can continue if at least 3 are operational. (Quorum/Consensus requires an odd number of Store instances in the QC group.)
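The majority rule can be written as a short calculation. The helper names below are hypothetical and are used only to illustrate the arithmetic for a QC group.

```python
def quorum_size(configured_stores: int) -> int:
    """Smallest number of Stores that forms a majority of the group."""
    return configured_stores // 2 + 1


def messaging_can_continue(configured_stores: int, operational_stores: int) -> bool:
    """True if the operational Stores still form a quorum."""
    return operational_stores >= quorum_size(configured_stores)


for configured in (3, 5):
    print(configured, "configured, quorum is", quorum_size(configured))
```

With 3 configured Stores the quorum is 2, and with 5 it is 3, matching the examples in the text.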
Sources define the QC group with the LBM configuration option ume_store (source), specified once for each Store in the group.