Operations Guide
Startup/Shutdown Procedures

In a multicast environment, only the applications and monitoring tools need to be started. If using Persistence, the store daemon (umestored) also needs to be started. Likewise, use of the UM Router requires starting the UM Router daemon (tnwgd).

In a unicast-only environment, one or more resolver daemons (lbmrd) are typically required. It is recommended that you start the lbmrd before starting the applications.

Informatica recommends that you shut down applications using UM sources and receivers cleanly, even though UM is able to cope with the ungraceful shutdown and restart of applications and UM daemons.

A failed assertion could lead to immediate application shutdown. If opting to restart a UM client or lbmrd, no other components need be restarted. Failed assertions should be logged with Informatica support.


Topic Resolution

Your UM development or administration team should anticipate the time and bandwidth required to resolve all topics when all applications initially start. This team should also establish any restarting restrictions.

Operations staff should not have any direct topic resolution tasks aside from monitoring the increased CPU and bandwidth usage.

Topic resolution is the process by which a receiver discovers a topic's transport session information so that it can receive the topic's messages. In a multicast environment, topic resolution does not need to be started or shut down. However, it does use network resources, and it can cause deaf receivers and other problems if it is not operating properly. See Topic Resolution in the UM Concepts Guide for more detailed information.

Applications cannot deliver messages until topic resolution completes. UM monitoring statistics are active before all topics resolve. In a large topic space (approximately 10,000 topics), topic resolution messages may be 'staggered' or rate controlled, potentially taking several seconds to complete.

For example, 10,000 topics at the default value of 1,000 for resolver_initial_advertisements_per_second (context) will take 10 seconds to send out an advertisement for every topic. If all receiving applications have been started first, fully resolving all topics may not take much more than 10 seconds. The rate of topic resolution can also be controlled with the resolver_initial_advertisement_bps (context) configuration option. Topic advertisements contain the topic string and approximately 110 bytes overhead. Topic queries from receivers contain no overhead, only the topic string.
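The arithmetic above can be sketched as a small estimate. This is a rough illustration only: the function name, the average topic length of 40 bytes, and the bandwidth limit of 1,000,000 bits per second are assumptions chosen for the example, not documented defaults. The sketch also assumes that, when both limits are set, the slower one governs.

```python
def advertisement_seconds(topic_count, ads_per_second,
                          avg_topic_len=0, bps_limit=None, overhead_bytes=110):
    """Estimate the seconds needed to send one advertisement per topic."""
    # Limit 1: advertisements per second.
    by_rate = topic_count / ads_per_second
    if bps_limit is None:
        return by_rate
    # Limit 2: bits per second. Each advertisement carries the topic
    # string plus approximately 110 bytes of overhead.
    bits_per_ad = (avg_topic_len + overhead_bytes) * 8
    by_bandwidth = topic_count * bits_per_ad / bps_limit
    # Assumption: the slower of the two limits determines the total time.
    return max(by_rate, by_bandwidth)


# 10,000 topics at 1,000 advertisements per second.
print(advertisement_seconds(10_000, 1_000))  # 10.0

# Same topic space with an illustrative bandwidth limit added.
print(advertisement_seconds(10_000, 1_000, avg_topic_len=40,
                            bps_limit=1_000_000))  # 12.0
```

The estimate covers only the initial advertisements. Actual resolution time also depends on when receivers start and on network conditions.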


UM Applications

Your UM development team should provide you with the application names, resident machines and startup parameters, along with a sequence of application/daemon startups and shutdowns.

The following lists typical application startup errors.

  • Lack of resources
  • License not configured - LOG Level 3: CRITICAL: LBM license invalid [LBM_LICENSE_FILENAME nor LBM_LICENSE_INFO are set]
  • Cannot bind port - lbm_context_create: could not find open TCP server port in range.

    Too many applications may be running using the UM context's configured port range on this machine. This possibility should be escalated to your UM development team.

    Application is possibly already running. It is possible to start more than one instance of the same UM application.

  • Invalid network interface name / mask - lbm_config: line 1: no interfaces matching criteria
  • Multiple interfaces detected - LOG Level 5: WARNING: Host has multiple multicast-capable interfaces; going to use [en1][10.10.10.102]

    This message appears on multi-homed machines when UM is not explicitly configured to use a single interface. This may not cause an issue but requires configuration review by your UM development team.
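As one possible pre-start check for the license error listed above, the following sketch reports which of the two environment variables named in the log message are set. The function name is hypothetical, and the check covers only these two variables. Any other way your deployment supplies the license is outside its scope.

```python
import os

# Variable names taken from the license error message quoted above.
LICENSE_VARS = ("LBM_LICENSE_FILENAME", "LBM_LICENSE_INFO")


def license_vars_set(env=None):
    """Return the license variables that are set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in LICENSE_VARS if env.get(name)]


found = license_vars_set()
if found:
    print("License variable(s) set:", ", ".join(found))
else:
    print("Neither LBM_LICENSE_FILENAME nor LBM_LICENSE_INFO is set")
```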


Indications of Possible Application Shutdown

A UM application shutdown may not be obvious immediately, especially if you are monitoring scores of applications. The following lists events that may indicate an application has shut down.

  • The Process ID disappears. Consider a method to monitor all process IDs (PIDs).
  • You notice the existence of a core dump file on the machine.
  • UM statistics appear to reduce in volume or stop flowing.
  • An application log contains one or more End Of Session (EOS) events, which signal the cessation of a transport session. This may indicate that a source application has shut down. Your UM development team must explicitly log LBM_MSG_EOS events. Some EOS events may be delayed for some transports.
  • An application log contains disconnect events (LBM_SRC_EVENT_DISCONNECT) for unicast transports (if implemented), which indicate that UM receiver applications have shut down.
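One way to monitor process IDs, as suggested in the first item above, is sketched below. It is a minimal illustration for Linux/Unix hosts only, and the function names and the name-to-PID mapping are hypothetical. Do not use it on Windows, where os.kill() behaves differently and can terminate the target process.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        # Signal 0 delivers nothing; it only checks that the PID can be signaled.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def missing_pids(expected):
    """Given {application name: PID recorded at startup}, list the names
    whose process is no longer present."""
    return sorted(name for name, pid in expected.items()
                  if not pid_is_running(pid))


# Example: this script's own PID is always present while it runs.
print(missing_pids({"this-script": os.getpid()}))  # []
```

A PID can be reused by an unrelated process after the original exits, so a check like this is best combined with the other indications listed above.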


Unicast Topic Resolver (lbmrd)

If not using multicast topic resolution, one or more instances of lbmrd must be started prior to starting applications. Unicast resolver daemons require an XML configuration file, and your UM development team can specify multiple resolver daemons for resiliency. For more information on Unicast Topic Resolution, see Unicast UDP TR.

Execute the following command on the appropriate machine to start a unicast topic resolver (lbmrd) from the command line:

lbmrd --interface=ADDR -L daemon_logfile.out -p PORT lbmrd.cfg

For more information on the lbmrd command-line, see Lbmrd Man Page.

To stop the resolver, use the kill command. If a unicast resolver daemon terminates, you need to restart it.

Observe the lbmrd logfile for errors and warnings.

To make the lbmrd a Windows Service, see UM Daemons as Windows Services.

If running multiple lbmrds and an lbmrd in the list becomes inactive, the following message appears in the clients' log files:

unicast resolver "<ip>:<port>" went inactive

If all unicast resolver daemons become inactive, the following message appears in the clients' log files:

No active resolver instances, sending via inactive instance
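A client log file can be scanned for the two messages above. The following is a minimal sketch that assumes the messages appear in the log exactly as quoted. The function name is hypothetical, and the address in the sample is an illustrative documentation address, not a real resolver.

```python
import re

# Message texts as quoted above.
ONE_INACTIVE = re.compile(r'unicast resolver "([^"]+)" went inactive')
ALL_INACTIVE = "No active resolver instances, sending via inactive instance"


def scan_resolver_messages(lines):
    """Return (resolvers reported inactive, True if all were reported inactive)."""
    inactive = []
    all_inactive = False
    for line in lines:
        match = ONE_INACTIVE.search(line)
        if match:
            inactive.append(match.group(1))
        elif ALL_INACTIVE in line:
            all_inactive = True
    return inactive, all_inactive


sample = [
    'unicast resolver "192.0.2.10:15380" went inactive',
    "No active resolver instances, sending via inactive instance",
]
print(scan_resolver_messages(sample))  # (['192.0.2.10:15380'], True)
```

To scan a real log, pass an open file object in place of the sample list.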

After all topics are resolved, daemons do not strictly need to be running unless you restart applications. Resolver daemons do not cache or persist state and do not require other shutdown maintenance.


Persistent Store (umestored)

Stores can operate in disk-backed or memory-only mode, as specified in the store's XML configuration file. Disk-backed stores are subject to the limitations of the disk hardware. Stores should not be run on virtual machines, and each store should have a dedicated disk. UM holds messages in memory until they are written to disk.


Starting a Store

Execute the following command on the appropriate machine to start a store (umestored) from the command line:

umestored config-file.xml

For more information on the umestored command-line, see Umestored Man Page.

  • Record umestored PID to monitor process presence for failure detection.
  • On Microsoft Windows®, monitor the umestored service.
  • Observe the umestored logfile for errors and warnings.

In disk mode, stores create two types of files.

  • Cache file - contains the actual persisted messages, and can grow to be very large over time. It is important to ensure that there is enough disk space to record the appropriate amount of persisted data.
  • State file - contains information about the current state of each client connection and is much smaller.

Stores do not create any files in memory-only mode.

To make the Store a Windows Service, see UM Daemons as Windows Services.


Restarting a Store

Perform the following procedure to restart a store.

  1. If the store is still running, kill the PID (Linux) or use the Windows Service Manager® to stop the Windows service.
  2. If you want a clean "start-of-day" start, delete the cache and state files. The location of these files is specified in the store's XML configuration file.
  3. Wait 20-30 seconds to let timeouts expire. Due to its use of connectionless protocols, Persistence depends upon timeouts. Therefore, do not rapidly restart the store.
  4. Run the command: "umestored config-file.xml". umestored automatically uses the existing cache and state files after a graceful shutdown and resumes as part of the current messaging stream at its last known position.
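The restart procedure above can be scripted for Linux hosts. The sketch below is one possible arrangement and not a supplied utility: the function name and parameters are hypothetical, and the caller must provide the cache and state file paths, which come from the store's XML configuration file. The kill, sleep, and spawn parameters exist only so the sequence can be exercised without a real store.

```python
import os
import signal
import subprocess
import time


def restart_store(pid, config_file, clean_files=(), wait_seconds=30,
                  kill=os.kill, sleep=time.sleep, spawn=subprocess.Popen):
    """Stop, optionally clean, wait, then start umestored (Linux only)."""
    # Step 1: if the store is still running, kill the PID.
    if pid is not None:
        try:
            kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # the store had already exited
    # Step 2: for a clean "start-of-day" start, delete cache and state files.
    for path in clean_files:
        if os.path.exists(path):
            os.remove(path)
    # Step 3: wait for timeouts to expire before starting again.
    sleep(wait_seconds)
    # Step 4: start the store with its configuration file.
    return spawn(["umestored", config_file])
```

A production script would also confirm that the old process has exited before deleting any files. Leaving clean_files empty keeps the existing cache and state files so that the store resumes at its last known position.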


Common Startup and Shutdown Issues

  • Cache and state directories don't exist.
  • Disk space - Cache files contain the actual persisted messages, and can grow to be very large over time. It is important to ensure that there is enough disk space to record the appropriate amount of persisted data.
  • Configuration error - UM parses a store's XML configuration file at startup, reporting errors to standard error.
  • Configuration error - UM reports other configuration errors in the store's log file.
  • Missing license details.
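The first two issues above can be checked before a store is started. The following sketch is an illustration with hypothetical names: the directory paths and the free-space threshold are values you supply, based on the store's XML configuration file and your own sizing of persisted data.

```python
import os
import shutil


def store_preflight(cache_dir, state_dir, min_free_bytes):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Issue: cache and state directories don't exist.
    for label, path in (("cache", cache_dir), ("state", state_dir)):
        if not os.path.isdir(path):
            problems.append(f"{label} directory does not exist: {path}")
    # Issue: not enough disk space for cache files.
    if os.path.isdir(cache_dir):
        free = shutil.disk_usage(cache_dir).free
        if free < min_free_bytes:
            problems.append(
                f"only {free} bytes free for cache files in {cache_dir}")
    return problems


# Example: check the current directory against a 1 GiB threshold.
for problem in store_preflight(os.getcwd(), os.getcwd(), 1024 ** 3):
    print(problem)
```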


UM Router (tnwgd)

When a UM Router starts, it discovers all sources and receivers in the topic resolution domains to which it connects. This results in a measurable increase in the overall volume of topic resolution traffic and can take some time to complete, depending upon the number of sources, receivers, and topics. The rate limits set on topic resolution also affect the time to resolve all topics.

See also Topic Resolution.


Starting a UM Router

Execute the following command on the appropriate machine to start a UM Router (tnwgd) from the command line:

tnwgd config-file.xml

Informatica recommends:

  • Record tnwgd PID to monitor process presence for failure detection.
  • Monitor the tnwgd logfile for errors and warnings.

For more information on the tnwgd command-line, see Tnwgd Man Page. To make the UM Router a Windows Service, see UM Daemons as Windows Services.


Restarting a UM Router

Perform the following procedure to restart a UM Router.

  1. If the UM Router is still running, kill the PID.
  2. Wait 20-30 seconds to let timeouts expire. After a restart new proxy sources and receivers must be created on the UM Router. Applications will not use the new proxies until the transport timeout setting expires for the old connections. Until this happens, applications may appear to be "deaf" since they are still considering themselves as connected to the "old" UM Router proxies. Therefore, do not rapidly restart the UM Router.
  3. Run the command: tnwgd config-file.xml


UM Daemons as Windows Services

On the Microsoft Windows platform, the UM daemons can be used either from the command line or as Windows Services. The available UM Services are:

Executable File | Description                      | Service Display Name                   | Man Page
lbmrds.exe      | UDP-based Unicast Topic Resolver | "LBMR Store Daemon"                    | man page
srsds.exe       | TCP-based Topic Resolver         | "UM Stateful Topic Resolution Service" | man page
storeds.exe     | Persistent Store                 | "UME Store Daemon"                     | man page
tnwgds.exe      | UM Router (DRO)                  | "Ultra Messaging Gateway"              | man page

Note that the Ultra Messaging Manager daemon ("ummd") is not offered as a Windows Service at this time.

As of UM version 6.12, the above UM daemons work similarly with respect to running as a Windows Service. See the individual man pages for differences.

Before beginning, make sure that the license key is provided in a way that the service will be able to access it. In particular, if you are using an environment variable to set the license key, it must be a system environment variable, not user.

There are 4 overall steps to running a UM daemon as a Windows Service:

  1. Install the Windows Service
  2. Configure the Daemon
  3. Configure the Windows Service
  4. Start the Windows Service

All 4 steps must be completed before the Service can be used.


Install the Windows Service

There are two ways to install a UM daemon as a Windows Service:

  • Product package installer.
  • Command line.

Product package installer

When installing the product using the package installer, the dialog box titled "Choose Components" provides one or more check boxes for UM daemons to be installed as services. You may check any number of the boxes and proceed with the installation.

Note that for any box not checked, the software for that daemon is still copied onto the machine. This allows for installation as a Windows Service at a later time using the Command Line method.

Also note that it is not necessary to use the package installer at all. The UM files can simply be copied to the machine. In that case, this step is skipped altogether, and the Command Line method must be used to install the daemons as Windows Services. See Copy Windows Runtime Files for details.

Command line

If a daemon was not installed as a Windows Service from the product's package installer (possibly because the package installer was not used), daemons can be installed at a later time from the command line. In this case, the Service install step is combined with the Configure the Windows Service step below.


Configure the Daemon

UM daemons are configured via XML configuration files. These files must be created and managed by the user. Each individual daemon needs its own separate XML configuration file.

Informatica recommends developing and testing the daemon configuration files interactively, using the command-line interface of each daemon. Do not run the daemon as a Windows Service until the daemon configuration has been validated and tested. This provides the fastest test cycle while the configuration is being developed and finalized.

The configuration files should be located on the hosts that are intended to run the daemons in files/folders of the user's choosing.

For more information on configuring and running the daemons interactively, see:

Executable File | Description                      | Configuration Details                 | Man Page
lbmrds.exe      | UDP-based Unicast Topic Resolver | lbmrd Configuration File              | man page
srsds.exe       | TCP-based Topic Resolver         | SRS Configuration File                | man page
storeds.exe     | Persistent Store                 | Configuration Reference for Umestored | man page
tnwgds.exe      | UM Router (DRO)                  | XML Configuration Reference           | man page


Configure the Windows Service

At this point, you should have the daemon XML configuration file(s) prepared and available on the host that is to run the desired daemon(s) (see Configure the Daemon). You should also have tested the configuration by running the daemon interactively to verify its correct operation.

Each daemon's Windows Service must now be configured so that the Service can find the proper daemon configuration file.

You also need to decide how the daemon writes log messages to the Windows Event Logging system. Log messages are categorized into different severity levels: "info", "notice", "warning", "err", "alert", "emerg". By default, the daemons will write log messages of category "warning" and above to the Windows Event Log. If desired, this can be adjusted with the "-e" flag, shown below. Be aware that setting the severity level below "warning" can result in very many messages being written to the Windows Event Log. Also be aware that messages of all severity levels are written to the daemon's log file, independent of the "-e" setting.
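The effect of the severity threshold can be illustrated with a short sketch. It assumes that the categories are listed above in ascending order of severity. The function name is hypothetical, and the strings are the category names from the list above, which are not necessarily the exact values accepted by the "-e" flag.

```python
# Categories in the order listed above, assumed to be ascending severity.
SEVERITIES = ["info", "notice", "warning", "err", "alert", "emerg"]


def event_log_levels(threshold="warning"):
    """Return the categories at or above the given threshold."""
    start = SEVERITIES.index(threshold)
    return SEVERITIES[start:]


# Default behavior: "warning" and above go to the Windows Event Log.
print(event_log_levels())        # ['warning', 'err', 'alert', 'emerg']

# Lowering the threshold to "info" selects every category.
print(event_log_levels("info"))
```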

Attention
To enter the commands below, you need a Command Prompt window running as Administrator. (One way to do this is to right-click on the Command Prompt icon and select "More > Run as administrator".)

To perform this step, each daemon service executable must be interactively executed from a command prompt with the following command-line flags. The exact command-line flag set depends on whether the daemon was already installed as a Windows Service when the UM package was installed.

Service was previously installed

Use this command if the daemon has already been installed as a Windows Service, perhaps using the product's Package Installer.

daemonexe -s config -e warn c:\daemon_config_file_path

Where "daemonexe" is the name of the desired daemon's executable (see UM Daemons as Windows Services for executable names).

Service was NOT previously installed

Use this command if the daemon has not yet been installed as a Windows Service, perhaps because the product's package installer was not used.

daemonexe -s install -e warn c:\daemon_config_file_path

Where "daemonexe" is the desired daemon's executable (see UM Daemons as Windows Services for executable names).
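When several daemons are configured from a deployment script, the choice between the two commands above can be captured in a small helper. This is a sketch with a hypothetical function name, and the configuration path in the example is illustrative. It only builds the argument list and does not run anything.

```python
def service_command(daemon_exe, config_path, already_installed,
                    event_level="warn"):
    """Build the argument list for configuring a UM daemon service."""
    # "config" if the service already exists, otherwise "install".
    action = "config" if already_installed else "install"
    return [daemon_exe, "-s", action, "-e", event_level, config_path]


# Store daemon that the package installer already registered as a service.
print(service_command("storeds.exe", r"c:\um\store.xml", True))

# UM Router daemon that has not yet been installed as a service.
print(service_command("tnwgds.exe", r"c:\um\tnwgd.xml", False))
```

The resulting list can be passed to subprocess.run() from a Command Prompt session running as Administrator.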

Some daemons have additional Service settings that should be configured prior to use. For example, the SRS has two additional log files that can only be set using command-line flags. See the individual daemon's Windows Service man page for each daemon's details (see UM Daemons as Windows Services for man page links).


Start the Windows Service

Windows Services are controlled by the "Services" control panel. See your Windows documentation for information on controlling Windows Services.


Remove the Windows Service

There are several ways to remove the UM daemons as Windows Services:

  • Uninstall the software (if it was installed by the Package Installer). This is done using the normal Windows "Add or Remove Programs" control panel. Performing this step removes the Windows Service, and also removes the installed files.
  • Manually remove the service using the daemon's executable program. Performing this step only removes the Windows Service; the installed files remain.

Manual Removal

Use this command to remove the daemon as a Windows Service. Note that the software files will remain.

daemonexe -s remove

Where "daemonexe" is the desired daemon's executable (see UM Daemons as Windows Services for executable names).


UM Analysis Tools

The following tools are available to analyze UM activity and performance.


Packet Capture Tools

  • Wireshark® is an open-source network packet analysis tool, for which Informatica provides 'dissectors' describing our packet formats. It is used to open and sift through packet capture files, which can be gathered by a variety of both software and hardware tools.
  • Tshark is a command-line version of Wireshark.
  • Tcpdump is the primary software method for gathering packet capture data from a specific host. It is available on most Unix-based systems, though generally gathering packet captures with the tool requires super-user permissions.

For more information about Wireshark please visit https://www.wireshark.org/. (The UM plugins are part of the current release.)


Resource Monitors

  • Top is a system resource monitor available on Linux/Unix that presents a variety of useful data, such as CPU use (both average and per-CPU), including time spent in user mode, system mode, time processing interrupts, time spent waiting on I/O, etc.
  • Microsoft® Windows® System Resource Manager manages Windows Server® 2008 processor and memory usage with built-in or custom resource policies.
  • prstat is a resource manager for Solaris® that provides similar CPU and memory usage information.


Process Analysis Tools

  • pstack dumps a stack trace for a process (pid). If the process named is part of a thread group, then pstack traces all the threads in the group.

  • gcore generates a core dump for a Solaris, Linux, and HP-UX process. The process continues after core has been dumped. Thus, gcore is especially useful for taking a snapshot of a running process.


Network Tools

  • netstat provides network statistics for a computer's configured network interfaces. This extensive command-line tool is available on Linux/Unix based systems and Windows operating systems.
  • wget is a Linux tool that captures content from a web interface, such as a UM daemon web monitor. Its features include recursive download, conversion of links for off-line viewing of local HTML, support for proxies, and more.
  • netsh is a Windows utility that allows local or remote configuration of network devices such as the interface.


UM Tools

  • lbmmoncache is a utility that monitors both source notification and source/receiver statistics. Contact UM Support for more information about this utility.
  • lbtrreq restarts the topic resolution process. Contact UM Support for more information about this utility.


UM Debug Flags

The use of UM debug flags requires the assistance of UM Support. Also refer to the following Knowledge Base articles for more information about using debug flags.