Guide for Persistence
Persistence Architecture

As shown in the diagram, UM provides messaging functionality as well as persistent operation.

persistent_architecture.png

The highlights of this architecture are:

  • Sources communicate with stores
  • Receivers communicate with stores
  • Sources communicate with receivers

Note that the store is not supported on all platforms. For example, while OpenVMS supports persistent clients (source and receiver), you cannot run a store on an OpenVMS system. However, an OpenVMS-based client can interoperate with a store running on any other supported platform.


Persistent Store Architecture

The umestored daemon runs the persistent store feature. You can configure multiple stores per daemon using the '<store>' element in the umestored XML configuration file. See Configuration Reference for Umestored. Individual stores can use separate disk cache and disk state directories and can be configured to persist messages for multiple sources (topics). The per-source storage areas are referred to as source repositories. UM provides each umestored daemon with a Web Monitor for statistics monitoring. See Store Web Monitor.

store_architecture.png


Source Repositories

Within a store, you configure repositories for individual topics. Each topic can have its own set of '<topic>' level options that affect the repository's type, size, liveness behavior and much more. If you have multiple sources sending on the same topic, UM creates a separate repository for each source and applies the repository options configured for the topic to each source's repository. For example, if you specify 48MB for the size of the repository and have 10 sources sending on the topic, the persistent store requires 480MB of storage for that topic.

A repository can be configured as one of the following types:

  • no cache - the repository does not retain any data, only state information
  • memory - the repository maintains both state and data only in memory
  • disk - the repository maintains state and data on disk, but also uses a memory cache.
  • reduced-fd - the repository maintains state and data on disk, also uses a memory cache but uses significantly fewer File Descriptors. Normally a store uses two File Descriptors per topic in addition to normal UM file descriptors for transports and other objects. The reduced-fd repository type uses 5 File Descriptors for the entire store, regardless of the number of topics, in addition to normal UM file descriptors for transports and other objects. Use of this repository type may impact performance.

You can configure any combination of repository types within a single store configuration.
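The file descriptor figures in the repository type list above can be turned into a quick estimate. The following is a minimal sketch, not part of UM: the function name store_file_descriptors is hypothetical, it counts only the repository descriptors the text mentions (two per topic for disk, five per store for reduced-fd), and it excludes the normal UM descriptors for transports and other objects.

```python
def store_file_descriptors(topic_count: int, repository_type: str) -> int:
    """Estimate repository file descriptors for one store.

    Hypothetical helper based only on the figures quoted in the text.
    Transport and other UM descriptors are NOT included.
    """
    if topic_count < 0:
        raise ValueError("topic_count must be non-negative")
    if repository_type == "disk":
        # Two descriptors per topic.
        return 2 * topic_count
    if repository_type == "reduced-fd":
        # Five descriptors for the entire store, regardless of topic count.
        return 5
    # The text gives no figure for other repository types, so refuse to guess.
    raise ValueError(f"no descriptor figure known for type {repository_type!r}")


if __name__ == "__main__":
    for kind in ("disk", "reduced-fd"):
        print(kind, store_file_descriptors(1000, kind))
```

An estimate like this can help when deciding whether the per-process descriptor limit of the store host is a reason to accept the possible performance cost of reduced-fd.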


Repository Thresholds and Limits

Repositories are designed as circular buffers. When age or size thresholds are met for a topic, the repository removes or overwrites messages in order to prevent reaching its configured limit, which keeps space available for new messages. UM provides UM configuration options and store configuration options to control threshold and limit behavior.

UM configuration options control source repositories for all the sources sending within the context. The default for each of these options, listed below, is 0 (zero), which makes the like-named repository option in the umestored XML configuration file take effect.

See Ultra Messaging Persistence Options.

Note: The above configuration options' default values can be altered for individual sources and receivers by calling lbm_src_topic_attr_setopt() before you allocate the topic.

The umestored configuration options for source/topic repositories explained below can also be used to control threshold and limit behavior. See Options for a Topic's ume-attributes Element for complete information about the following repository options.

Note
Whether you use the UM configuration options mentioned above or the source repository options explained below to control source repository threshold and limit behavior, remember the values you configure apply to a single source sending to the store. If you use the default repository size limit of 48 MB and you have 1,000 sources sending to the store, UM creates a store with 1,000 source repositories of 48 MB each, which requires a store with approximately 48 GB of memory. And if you use the default disk file size limit of 100 MB and you have 1,000 sources sending to the store, UM creates a store with 1,000 source repositories of 100 MB each, which requires a store with disk storage capacity of approximately 100 GB.
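The capacity arithmetic in the note above is a simple multiplication, sketched here so it can be rerun with your own numbers. This is an illustration only: the helper name store_capacity_bytes is hypothetical, and decimal megabytes are assumed so that the results match the approximate figures in the text.

```python
MB = 1_000_000        # assumption: decimal units, to match the text's round figures
GB = 1_000_000_000


def store_capacity_bytes(source_count: int, per_source_bytes: int) -> int:
    """Total space needed when every source gets its own repository."""
    return source_count * per_source_bytes


if __name__ == "__main__":
    # 1,000 sources with a 48 MB repository size limit each (memory).
    memory_needed = store_capacity_bytes(1000, 48 * MB)
    # 1,000 sources with a 100 MB disk file size limit each (disk).
    disk_needed = store_capacity_bytes(1000, 100 * MB)
    print(f"memory: {memory_needed / GB:.0f} GB, disk: {disk_needed / GB:.0f} GB")
```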

Memory Repository

A memory type source repository has three configuration options that manage its size relative to its capacity.

  • repository-age-threshold - This value determines how long the repository retains messages. The repository deletes any message older than this configured value.

  • repository-size-threshold - The size in bytes that a repository can reach before it begins to delete the oldest retained messages. If the repository size falls below the threshold, it stops deleting old messages.

  • repository-size-limit - The maximum size in bytes for the repository. Once this limit is reached, the repository stops accepting new messages. Set the age and size thresholds at levels that guarantee the size limit is never reached. Consider how fast the source sends messages, the size of the messages and the reliability of the receivers. For example, more reliable receivers mean fewer recovery instances, which could allow a shorter age threshold.
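The interaction of the three memory repository options can be illustrated with a toy model. This is a sketch under stated assumptions, not the umestored implementation: the class MemoryRepositoryModel and its methods are hypothetical, trimming is assumed to run just before each new message is offered, and the age threshold is expressed in the same unit as the timestamps passed in.

```python
from collections import deque


class MemoryRepositoryModel:
    """Toy model of age threshold, size threshold and size limit."""

    def __init__(self, age_threshold, size_threshold, size_limit):
        self.age_threshold = age_threshold    # maximum message age
        self.size_threshold = size_threshold  # bytes; deletion starts above this
        self.size_limit = size_limit          # bytes; hard cap
        self.messages = deque()               # (arrival_time, size), oldest first
        self.size = 0

    def _drop_oldest(self):
        _, size = self.messages.popleft()
        self.size -= size

    def trim(self, now):
        # Age threshold: delete every message older than the configured age.
        while self.messages and now - self.messages[0][0] > self.age_threshold:
            self._drop_oldest()
        # Size threshold: delete oldest messages until the size is no longer
        # above the threshold.
        while self.messages and self.size > self.size_threshold:
            self._drop_oldest()

    def offer(self, now, size):
        """Return True if the message is stored, False if it is refused
        because storing it would exceed the size limit."""
        self.trim(now)
        if self.size + size > self.size_limit:
            return False
        self.messages.append((now, size))
        self.size += size
        return True


if __name__ == "__main__":
    repo = MemoryRepositoryModel(age_threshold=10, size_threshold=100, size_limit=150)
    for t in range(3):
        print(t, repo.offer(t, 60), repo.size)
```

In this model the gap between the size threshold and the size limit is the headroom that absorbs new messages while old ones are being deleted, which is why the thresholds should sit comfortably below the limit.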

Disk or Reduced-fd Repositories

A disk or reduced-fd type source repository maintains a memory cache in addition to the actual disk storage. It continually persists messages from the memory cache to the disk. For receiver recovery, it uses the memory cache first and performs disk reads only for messages that are no longer in the cache. It has four configuration options that manage its size relative to its capacity.

  • repository-age-threshold - This value determines how long the disk repository retains messages in its memory cache. The repository deletes any message older than this configured value from the memory cache. These messages could have been persisted to disk and may still be available for recovery.

  • repository-size-threshold - The size in bytes that the disk repository's memory cache can reach before the repository begins to delete the oldest retained messages from the cache. These messages could have been persisted to disk and may still be available for recovery. If the memory cache size falls below the threshold, the repository stops deleting old messages.

  • repository-size-limit - The maximum size in bytes for the disk repository's memory cache. Once this limit is reached, the repository stops accepting new messages. Set the age and size thresholds at levels that guarantee the size limit is never reached. Consider how fast the source sends messages, the size of the messages and the reliability of the receivers. For example, more reliable receivers mean fewer recovery instances, which could allow a shorter age threshold.

  • repository-disk-file-size-limit - The maximum disk space (in bytes) for the disk repository. Once this limit is reached, the repository overwrites old messages with new messages. Overwriting old messages is not necessarily a problem, provided your disk file size is adequate. However, if messages needed for recovery are in neither the memory cache nor the disk file, you may need to increase the disk file size so that messages are overwritten only after they are no longer needed for receiver recovery.
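The relationship between the memory cache and the disk file can be illustrated with a second toy model. Again this is a sketch, not the umestored implementation: the class DiskRepositoryModel is hypothetical, messages are assumed to be persisted to disk immediately, and only the cache size threshold and the disk file size limit are modeled (the age threshold and cache size limit are left out for brevity).

```python
from collections import OrderedDict


class DiskRepositoryModel:
    """Toy model: bounded memory cache in front of a bounded disk file."""

    def __init__(self, cache_size_threshold, disk_file_size_limit):
        self.cache_size_threshold = cache_size_threshold
        self.disk_file_size_limit = disk_file_size_limit
        self.cache = OrderedDict()  # sequence number -> size, oldest first
        self.disk = OrderedDict()   # sequence number -> size, oldest first
        self.cache_bytes = 0
        self.disk_bytes = 0

    def store(self, seq, size):
        # Keep the message in the cache and persist it to disk.
        self.cache[seq] = size
        self.cache_bytes += size
        self.disk[seq] = size
        self.disk_bytes += size
        # Cache above threshold: delete oldest messages from the cache only.
        while self.cache and self.cache_bytes > self.cache_size_threshold:
            _, old = self.cache.popitem(last=False)
            self.cache_bytes -= old
        # Disk file above its limit: the oldest messages are overwritten.
        while self.disk and self.disk_bytes > self.disk_file_size_limit:
            _, old = self.disk.popitem(last=False)
            self.disk_bytes -= old

    def recover(self, seq):
        """Return where a message would be served from, or None if lost."""
        if seq in self.cache:
            return "cache"
        if seq in self.disk:
            return "disk"
        return None


if __name__ == "__main__":
    repo = DiskRepositoryModel(cache_size_threshold=20, disk_file_size_limit=40)
    for seq in range(1, 7):
        repo.store(seq, 10)
    for seq in (6, 3, 1):
        print(seq, repo.recover(seq))
```

A message that returns None here corresponds to the case in the text where a message needed for recovery has already been overwritten, which is the signal to consider a larger disk file size limit.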


Persistent Store Fault Tolerance

Sources and receivers register with a store and use individual repositories within the store. Sources can use redundant repositories configured in multiple stores in a Quorum/Consensus arrangement for fault tolerance. Be aware that the arrangement of stores into Quorum/Consensus groups is a function of the source. That is, the individual stores of a Quorum/Consensus group are not aware of each other and do not coordinate their activities.


Identifying Persistent Stores

You can identify stores with either a domainID:interface:port, interface:port or a name. Using only interface:port is more feasible in smaller implementations where the smaller number of possible IP addresses is easier to manage. Larger implementations, especially those that span topic resolution domains using UM Routers, are better served with stores identified by a name or domainID:interface:port.
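The three identification forms can be told apart mechanically, as the following sketch shows. It is an illustration only and does not reproduce how UM parses its configuration options: the names StoreId and classify_store_id are hypothetical, only IPv4 addresses are handled, and the domain ID value 1 used in the example is made up.

```python
import ipaddress
from typing import NamedTuple, Optional


class StoreId(NamedTuple):
    kind: str                        # "domain-interface-port", "interface-port" or "name"
    domain_id: Optional[int] = None
    interface: Optional[str] = None
    port: Optional[int] = None
    name: Optional[str] = None


def _is_ipv4(text: str) -> bool:
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def classify_store_id(text: str) -> StoreId:
    """Classify a store identifier into one of the three forms in the text."""
    parts = text.split(":")
    if (len(parts) == 3 and parts[0].isdigit()
            and _is_ipv4(parts[1]) and parts[2].isdigit()):
        return StoreId("domain-interface-port", int(parts[0]), parts[1], int(parts[2]))
    if len(parts) == 2 and _is_ipv4(parts[0]) and parts[1].isdigit():
        return StoreId("interface-port", None, parts[0], int(parts[1]))
    if len(parts) == 1 and text:
        return StoreId("name", name=text)
    raise ValueError(f"unrecognized store identifier: {text!r}")


if __name__ == "__main__":
    for example in ("1:10.29.3.16:14567", "10.29.3.16:14567", "NEWYORK-1"):
        print(classify_store_id(example))
```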

UM automatically resolves and maintains a mapping between a store name and a single topic resolution domain, IP address and port. UM also automatically resolves store names if the store is located across one or more UM Routers in a different topic resolution domain.

The following lists other specifics of store identification.

  • A store sends advertisements at startup and in response to queries from sources.
  • If a store receives a context name advertisement that matches its own store name, umestored issues a warning in the store's log.
  • Sources using named stores issue an information message to the application every time a resolved context name changes its DomainID:IPaddress:port.

Using a Single Interface and Port

Configure a store for a single interface and port.

  1. Identify the store with only the interface:port, specified in the umestored XML configuration file.

    <store name="newyork-1" port="14567" interface="10.29.3.16">

  2. Add the interface:port to ume_store (source) so sources can find and register with the store.

    source ume_store 10.29.3.16:14567

To run the store on a different machine for any reason, you must change both the umestored XML configuration file and the UM configuration file.

Using a Range of Interfaces

Configure a store with a range of IP addresses.

  1. Identify the store with a range of interfaces specified in the umestored configuration file.

    <store name="newyork-1" port="14567" interface="10.29.3.16/25">

  2. Add the active interface to ume_store (source) so sources can find and register with the store. You can only specify one interface in the configuration file.

    source ume_store 10.29.3.16:14567

To run the store on a different machine, you need only change the interface specified in the ume_store (source) UM configuration option, provided you use one of the interfaces in the range specified in the umestored configuration file.
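Before moving a store, you can check whether the new machine's address falls inside the configured range. The sketch below assumes the interface attribute is read as an address with a CIDR prefix length, as in the example above. The function name interface_in_store_range and the candidate addresses are hypothetical.

```python
import ipaddress


def interface_in_store_range(interface: str, store_interface_attr: str) -> bool:
    """Return True if `interface` lies in the range given by the store's
    interface attribute (for example "10.29.3.16/25")."""
    # strict=False accepts a host address with a prefix length and
    # normalizes it to the enclosing network (here 10.29.3.0/25).
    network = ipaddress.ip_network(store_interface_attr, strict=False)
    return ipaddress.ip_address(interface) in network


if __name__ == "__main__":
    for candidate in ("10.29.3.42", "10.29.3.200"):
        print(candidate, interface_in_store_range(candidate, "10.29.3.16/25"))
```

An address that fails this check would also require a change to the umestored configuration file, not just to ume_store.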

Using a Store (context) Name

Configure a store with a name instead of just IP:port. '0.0.0.0' (INADDR_ANY) or no value is the default for the store's interface attribute.

  1. Identify the store with a context-name option that resolves to the interface and port - or range of interfaces and port - specified in the umestored configuration file:

    <store name="newyork-1" port="14567" interface="0.0.0.0">
    <ume-attributes>
    <option name="context-name" type="store" value="NEWYORK-1"/>
    </ume-attributes>

    OR

    <store name="newyork-1" port="14567" interface="10.29.3.16">
    <ume-attributes>
    <option name="context-name" type="store" value="NEWYORK-1"/>
    </ume-attributes>

    OR

    <store name="newyork-1" port="14567" interface="10.29.3.16/25">
    <ume-attributes>
    <option name="context-name" type="store" value="NEWYORK-1"/>
    </ume-attributes>

  2. Add the store's context name to ume_store_name (source) so sources can find and register with the store.

    source ume_store_name NEWYORK-1

You do not have to make any configuration changes to run NEWYORK-1 on another machine, provided the new interface matches one of those specified in the umestored configuration file. This includes running the store in a different topic resolution domain.