Guide for Persistence
See Example Protocol Files for the protocol buffer definition files.
The different message types are:
Each one has a specific structure associated with it, as detailed in umedmonmsgs.h.
Note that message types ending with "_CONFIG" are in the config category, while message types ending with "_STATS" are in the stats category. See Daemon Statistics Structures for information on how the two categories are handled differently.
A monitoring application receiving these messages must detect if there is an endian mismatch (see Daemon Statistics Binary Data). The header structure umestore_dmon_msg_hdr_t contains a 16-bit field named magic which is set equal to LBM_UMESTORE_DMON_MAGIC. The receiving application should compare it to LBM_UMESTORE_DMON_MAGIC and LBM_UMESTORE_DMON_ANTIMAGIC. Anything else would represent a serious problem.
If the receiving app sees that magic equals LBM_UMESTORE_DMON_MAGIC, then it can simply access the binary fields directly. However, if it sees LBM_UMESTORE_DMON_ANTIMAGIC, then most (but not all) binary fields need to be byte-swapped. See Example umedmon.c for an example, paying special attention to the macros COND_SWAPxx (which conditionally swap based on the magic test) and the functions byte_swapXX() (which perform the byte swapping).
However, there are some binary fields which must never be swapped, regardless of endianness. This is indicated in the documentation. For example, umestore_store_dmon_config_msg_t_stct::store_iface says "NOTE: This field should NOT be byte-swapped." Here's how that field might be accessed:
As you can see, store_iface is not byte-swapped, but store_port is (conditionally) swapped.
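The magic test and the selective swapping described above could be sketched as follows. This is a minimal illustration, not the code from umedmon.c: the structure, the magic values, and every name prefixed with ex_ or EX_ are hypothetical stand-ins, and the real constants and layouts are defined in umedmonmsgs.h.

```c
#include <stdint.h>

/* Hypothetical values for illustration only. The real
 * LBM_UMESTORE_DMON_MAGIC and LBM_UMESTORE_DMON_ANTIMAGIC
 * are defined in umedmonmsgs.h. */
#define EX_MAGIC     0x4542u
#define EX_ANTIMAGIC 0x4245u

/* Hypothetical, simplified stand-in for a config message. */
typedef struct {
    uint16_t magic;        /* from the message header */
    uint16_t store_port;   /* in the sender's byte order */
    uint32_t store_iface;  /* documented as "should NOT be byte-swapped" */
} ex_config_msg_t;

/* Unconditionally swap the two bytes of a 16-bit value. */
uint16_t ex_swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Returns 0 if fields can be read directly, 1 if they need swapping,
 * and -1 if the magic field matches neither constant. */
int ex_check_magic(uint16_t magic)
{
    if (magic == EX_MAGIC) return 0;
    if (magic == EX_ANTIMAGIC) return 1;
    return -1;
}

/* Stores the port in host byte order in *port. Returns -1 on bad magic. */
int ex_get_port(const ex_config_msg_t *msg, uint16_t *port)
{
    int swap = ex_check_magic(msg->magic);
    if (swap < 0) return -1;
    *port = swap ? ex_swap16(msg->store_port) : msg->store_port;
    return 0;
}

/* The interface is returned exactly as received, never swapped. */
uint32_t ex_get_iface(const ex_config_msg_t *msg)
{
    return msg->store_iface;
}
```

Treating an unrecognized magic value as an error, rather than guessing a byte order, matches the statement above that any other value represents a serious problem.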
There are some messages which contain string buffers at the ends of the messages. Strings in these data structures are always null-terminated. Be aware that these messages are not sent as fixed-length equal to the size of the data structure, but rather are sent with only the bytes required by the string (including the final null). For example, the structure umestore_store_pattern_dmon_config_msg_t contains the field umestore_store_pattern_dmon_config_msg_t_stct::pattern_buffer which is a char array of size LBM_UMESTORE_DMON_TOPIC_PATTERN_STRLEN. If pattern_buffer is set to ".*", then only 3 bytes (including the null string terminator) are sent for that field.
(Contrast this with DRO Daemon Statistics String Buffers.)
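Because the trailing buffer is truncated on the wire, a receiver cannot assume that a full structure's worth of bytes arrived. The sketch below shows one way to compute the expected message length and to check that the string is terminated within the received bytes. The structure, its header fields, and the buffer size are hypothetical stand-ins for the real definitions in umedmonmsgs.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical size; stands in for LBM_UMESTORE_DMON_TOPIC_PATTERN_STRLEN. */
#define EX_PATTERN_STRLEN 256

/* Hypothetical, simplified stand-in for a pattern config message. */
typedef struct {
    unsigned short magic;
    unsigned short type;
    char pattern_buffer[EX_PATTERN_STRLEN];
} ex_pattern_msg_t;

/* Number of bytes sent: the fixed part plus the string and its null. */
size_t ex_pattern_msg_wire_len(const ex_pattern_msg_t *msg)
{
    return offsetof(ex_pattern_msg_t, pattern_buffer)
           + strlen(msg->pattern_buffer) + 1;
}

/* Returns 1 if a received buffer of 'len' bytes contains a
 * null-terminated pattern string, 0 otherwise. */
int ex_pattern_is_terminated(const void *data, size_t len)
{
    size_t off = offsetof(ex_pattern_msg_t, pattern_buffer);
    if (len <= off) return 0;
    return memchr((const char *)data + off, '\0', len - off) != NULL;
}
```

Checking for the terminator before calling any string function keeps a truncated or corrupted message from causing a read past the end of the received data.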
This becomes more complicated when there are multiple strings in one message. For example, consider umestore_store_dmon_config_msg_t. This message contains three strings: Store name, cache directory name, and state directory name. But a single char array is declared:
The three strings are packed into that buffer, only taking up as much space as is necessary. For example, if the three strings are "a", "b", and "c", only 6 bytes of the buffer will be consumed (each string has a null).
To make it easier for the code to find the three strings, the structure has three offset variables: store_name_offset, disk_cache_dir_offset, and disk_state_dir_offset. These are byte offsets from the start of the entire structure. So, to access the Store name, the monitoring application might use:
(The practice of using offsets from the start of the structure allows for greater flexibility in ensuring inter-version compatibility.)
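The packing and offset scheme can be illustrated with a small, self-contained sketch. The structure below is a hypothetical stand-in: the offset field types, the buffer name string_buffer, and its size are assumptions, and the offsets are assumed to already be in host byte order (in a real message they are binary fields subject to the endian handling described earlier).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the Store config message. */
typedef struct {
    uint16_t store_name_offset;
    uint16_t disk_cache_dir_offset;
    uint16_t disk_state_dir_offset;
    char string_buffer[64];   /* hypothetical name and size */
} ex_store_config_msg_t;

/* Resolve an offset measured from the start of the whole structure. */
const char *ex_string_at(const ex_store_config_msg_t *msg, uint16_t offset)
{
    return (const char *)msg + offset;
}

/* Pack three strings back to back and record their offsets.
 * Returns the number of bytes of the structure in use, or 0 if
 * the strings do not fit in the buffer. */
size_t ex_pack_strings(ex_store_config_msg_t *msg, const char *name,
                       const char *cache_dir, const char *state_dir)
{
    size_t pos = offsetof(ex_store_config_msg_t, string_buffer);
    size_t need = strlen(name) + strlen(cache_dir) + strlen(state_dir) + 3;
    char *base = (char *)msg;

    if (need > sizeof(msg->string_buffer)) return 0;

    msg->store_name_offset = (uint16_t)pos;
    memcpy(base + pos, name, strlen(name) + 1);
    pos += strlen(name) + 1;

    msg->disk_cache_dir_offset = (uint16_t)pos;
    memcpy(base + pos, cache_dir, strlen(cache_dir) + 1);
    pos += strlen(cache_dir) + 1;

    msg->disk_state_dir_offset = (uint16_t)pos;
    memcpy(base + pos, state_dir, strlen(state_dir) + 1);
    pos += strlen(state_dir) + 1;

    return pos;
}
```

With the strings "a", "b", and "c", the packed strings occupy 6 bytes of the buffer, which matches the example given earlier.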
There is a set of fields in umestore_store_dmon_stat_msg_t which give statistics on recovery operations initiated by receivers:
The web monitor's Store Web Monitor Store Page has a manual function labeled Reset Rate Stats which clears those "ume_retx_..._count" fields. This is a useful function for users who use the web monitor as their primary monitoring tool, but for users who depend on the published Daemon Statistics, it can be disruptive for the counts to be cleared on-demand.
The field umestore_store_dmon_stat_msg_t_stct::ume_retx_stat_interval contains the seconds since the last Reset Rate Stats operation. If the user has not used Reset Rate Stats, then ume_retx_stat_interval contains the seconds since the Store's startup.
There are two places in the Store configuration file that Daemon Statistics are configured:
Here is an example of configuring daemon statistics.
In this example, all stats-type messages are (conditionally) published on a 3-second interval, except those of store0, which are published (conditionally) on a 6-second interval. All config-type messages are published (unconditionally) on a 120-second interval.
The Store Process allows a monitoring application to send a specific set of requests that control the operation of Daemon Statistics and other operations of the Store. The <remote-snapshot-request> and <remote-config-changes-request> elements control whether the Store enables the Daemon Controller operation (both default to disabled).
If enabled, the monitoring application can send a request message to the Store in the form of a topicless unicast immediate "request" message (see lbm_unicast_immediate_request() with NULL for topic). The format of the message is a simple ASCII string, with or without null termination. Due to the simple format of the message, no data structure is defined for it.
When the Store receives and validates the request, it sends a UM response message back to the requesting application containing a status message (which is not null-terminated). If the status was OK, the Store also performs the requested action.
Since Daemon Control Requests are sent as UIM messages, you must use a target string to address the request to the desired Store Process. The general form of a UIM target address is described in UIM Addressing, but is illustrated by this example:
TCP:10.29.3.46:12009
where 10.29.3.46:12009 is the IP and Port of the Daemon Control context UIM port. These are typically configured using the request_tcp_interface (context) and request_tcp_port (context) options in the LBM configuration file specified by the UMP Element "<lbm-config>" contained within the UMP Element "<daemon-monitor>".
The example program Example umedcmd.c demonstrates the correct way to send the messages and receive the responses. See umedcmd Man Page for usage details.
REQUEST TYPES ENABLED BY <remote-snapshot-request>:
REQUEST TYPES ENABLED BY <remote-config-changes-request>:
A Store Process can have multiple Store instances. But the UIM message is sent to the Daemon Control context within the Store Process.
Except as noted, the following requests can either be applied to all Store instances in the Store Process, or to just one Store instance. To apply the request to one Store instance, the Store name (as specified in the UMP Element "<store>" attribute "name") should be specified in double quotes.
memory 5
"store1" src 5
"store1" rcv 5
"store1" disk 5
"store1" store 5
"store1" config 5
For the following requests, a Store instance must be supplied as part of the request. It is supplied as an IP and Port, as specified in the UMP Element "<store>" attributes "interface" and "port". Note that the following requests are not related to the Daemon Statistics feature, but are nonetheless enabled by <remote-config-changes-request>.
mark 10.29.3.16 12000 127025183 500
deregister 10.29.3.16 12000 127025183 127025184
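Since the requests are plain ASCII strings, a monitoring application can assemble them with snprintf(). The following is a minimal sketch, not taken from umedcmd.c: the function names are hypothetical, and the formats are inferred only from the example requests shown above. Sending the resulting string to the Store is not shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Each function returns the string length, or -1 if 'buf' is too small. */

/* Publishing-interval request, e.g.  "store1" src 5   or   memory 5
 * Pass NULL for store_name to omit the Store instance name. */
int ex_fmt_interval(char *buf, size_t len, const char *store_name,
                    const char *group, unsigned interval)
{
    int n = (store_name != NULL)
        ? snprintf(buf, len, "\"%s\" %s %u", store_name, group, interval)
        : snprintf(buf, len, "%s %u", group, interval);
    return (n > 0 && (size_t)n < len) ? n : -1;
}

/* Mark request, e.g.  mark 10.29.3.16 12000 127025183 500 */
int ex_fmt_mark(char *buf, size_t len, const char *store_ip,
                unsigned store_port, unsigned long src_regid,
                unsigned long sqn)
{
    int n = snprintf(buf, len, "mark %s %u %lu %lu",
                     store_ip, store_port, src_regid, sqn);
    return (n > 0 && (size_t)n < len) ? n : -1;
}

/* Deregister request, e.g.  deregister 10.29.3.16 12000 127025183 127025184 */
int ex_fmt_deregister(char *buf, size_t len, const char *store_ip,
                      unsigned store_port, unsigned long src_regid,
                      unsigned long rcv_regid)
{
    int n = snprintf(buf, len, "deregister %s %u %lu %lu",
                     store_ip, store_port, src_regid, rcv_regid);
    return (n > 0 && (size_t)n < len) ? n : -1;
}
```

Rejecting a truncated result is deliberate: a request cut short by a small buffer could still be syntactically valid and act on the wrong sequence number or registration ID.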
There are occasions when a user might want to mark one or more messages in a Store's repository as invalid, to prevent them from being delivered to a recovering receiver. This can be useful if a misbehaving publisher sends a "poison" message that causes receivers to crash; having that message in the Store's repository means that restarting the failed receiver will just cause it to crash again when the message is recovered.
This message marking feature is provided by the daemon command-and-control feature Store Daemon Control Requests. Note that if there is more than one Store instance in this QC group, the request needs to be sent multiple times, once for each Store instance IP/Port.
Daemon Control requests can be sent by the example program Example umedcmd.c. Alternatively, that program's source code can be used as a guide for writing your own Store management program. See umedcmd Mark Mode for full details.
There are occasions when a user might want to deregister a failed receiver from a Store. This will delete the Store's state information for that receiver.
This receiver deregistration feature is provided by the Daemon command-and-control feature Store Daemon Control Requests.
A receiver's state information is stored per-source. For example, if an application creates a persistent receiver for topic X, and there are two sources for topic X, the Store will save two sets of state information for that receiver, one for each source for X. To fully clean up a failed receiving application, you need to deregister every pairing of receiver registration ID (RegID) associated with that receiver with every source RegID. And that must be repeated for each Store instance that the receiver was registered with. (Session IDs may not be used.)
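The pairing rule above means a cleanup tool has to walk every combination of source RegID and receiver RegID. A minimal sketch of that enumeration follows; the type and function names are hypothetical, and issuing the actual deregister request for each pair (and repeating it per Store instance) is left to the caller.

```c
#include <stddef.h>

/* One (source RegID, receiver RegID) pair identifying a set of receiver state. */
typedef struct {
    unsigned long src_regid;
    unsigned long rcv_regid;
} ex_regid_pair_t;

/* Fill 'out' with every source/receiver pairing, up to max_out entries.
 * Returns the total number of pairs required, which may exceed max_out. */
size_t ex_list_pairs(const unsigned long *src, size_t nsrc,
                     const unsigned long *rcv, size_t nrcv,
                     ex_regid_pair_t *out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < nsrc; i++) {
        for (size_t j = 0; j < nrcv; j++) {
            if (n < max_out) {
                out[n].src_regid = src[i];
                out[n].rcv_regid = rcv[j];
            }
            n++;
        }
    }
    return n;
}
```

For two sources and one receiver, this yields two pairs, one deregistration request per pair per Store instance.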
Once deregistered, the state and cache files are deleted and cannot be restored.
Note that if there is more than one Store instance in this QC group, the request needs to be sent multiple times, once for each Store instance IP/Port.
Daemon Control Requests can be sent by the example program Example umedcmd.c. Alternatively, that program's source code can be used as a guide for writing your own Store management program. See umedcmd Deregister Mode for full details.
The umedcmd example program sends Daemon Control Requests to a Store Process. Source code for umedcmd can be found with the other example programs; see Example umedcmd.c.
Note regarding umedcmd on Windows: see Known Issue 10897.
The umedcmd command has 3 modes of operation:
Each mode has a different usage pattern, which is determined by the value passed to the "-m" command-line option.
This form of the umedcmd command is used to control the publishing of Daemon Statistics by the Persistent Store.
*******************************************************************************
Usage: umedcmd -m publish -c config_file -T target_string [-L linger] [command_string]
Available options:
  -c, --config=FILE      Use LBM configuration file FILE.
                         Multiple config files are allowed.
                         Example:  '-c file1.cfg -c file2.cfg'
  -h, --help             display this help and exit
  -L, --linger=NUM       linger for NUM seconds before closing context
  -m, --mode=TYPE        set the command mode to TYPE 'publish' [required]
  -T, --target=TARGET    TARGET string for unicast immediate messages [required]
*******************************************************************************
The "-m mode" command-line option is optional in this usage. If supplied, it must be supplied as "-m publish". Omitting it defaults to publish mode.
The "-T target_string" contains the unicast immediate message destination address of the Daemon Control context UIM port (see Store Daemon Control Request Addressing).
The parameter "command_string" is optional. If supplied, it should be enclosed in single quotes. If omitted, the program enters an interactive mode in which the user can enter any number of commands (when used interactively, do not enclose the command string in single quotes). In interactive mode, use "h" for a brief help screen, and "q" to quit.
Valid command strings are:
***********************************************************************************
* Publish Mode                                                                    *
*   help (print this message): h                                                  *
*   quit (exit application): q                                                    *
*   report store dmon version: version                                            *
*   set publishing interval: memory 0-N                                           *
*                            ["store name"] src 0-N                               *
*                            ["store name"] rcv 0-N                               *
*                            ["store name"] disk 0-N                              *
*                            ["store name"] store 0-N                             *
*                            ["store name"] config 0-N                            *
*   snapshot all groups: ["store name"] snap memory|src|rcv|disk|store|config     *
***********************************************************************************
Note that most of the commands can optionally be preceded by a Store instance name in double quotes. Supplying it causes the command to apply only to the named Store instance. Omitting it causes the command to apply to all Store instances in the target Store Process.
For example:
umedcmd -c dstats.cfg -m publish -T TCP:10.29.3.16:12009 '"store1" src 5'
In this example, the Store Process's Daemon Control context has its UIM port configured as 12009 (see Store Daemon Control Request Addressing), and the Store instance is configured for the name "store1" (with the UMP Element "<store>" attribute "name"). The source repository statistics are set to a publishing interval of 5 seconds.
This form of the umedcmd command is used to mark persisted messages as invalid. This prevents their delivery to recovering receivers. See Request: Mark Stored Message Invalid.
*******************************************************************************
Usage: umedcmd -m mark -c config_file -i store_interface -p store_port -s src_regid -T target_string [-L linger] [-S sqn_string]
Available options:
  -c, --config=FILE      Use LBM configuration file FILE.
                         Multiple config files are allowed.
                         Example:  '-c file1.cfg -c file2.cfg'
  -h, --help             display this help and exit
  -i, --store_interface  store interface IPv4 address [required]
  -p, --store_port       store port [required]
  -L, --linger=NUM       linger for NUM seconds before closing context
  -m, --mode=TYPE        set the command mode to TYPE 'mark' [required]
  -s, --src_regid=ID     source registration ID associated with the store repository [required]
  -S, --sqn_string=LIST  LIST of one or more message sequence number(s) or ranges to drop], e.g.:
                         '-S 54'         drop a single message
                         '-S 312-315'    drops a range of messages
                         '-S 2,5,7-9'    drops two single and a range of messages
  -T, --target=TARGET    TARGET string for unicast immediate messages [required]
*******************************************************************************
The "-m mark" command-line option must be supplied.
The "-T target_string" contains the unicast immediate message destination address of the Daemon Control context UIM port (see Store Daemon Control Request Addressing).
The "-i store_interface" and "-p store_port" are required parameters which identify the desired Store instance within the Store Process, as specified in the UMP Element "<store>" attributes "interface" and "port".
The "-s src_regid" parameter is required to identify the specific source that sent the invalid message.
The command-line option "-S sqn_string" specifies the sequence number(s) of the messages that should be marked invalid. If omitted, the program enters an interactive mode in which the user can enter any number of sequence number strings.
A sequence number string can specify multiple sequence numbers and/or ranges of sequence numbers. A range is two sequence numbers separated by a dash. The string can consist of one or more sequence numbers or ranges, separated by commas. The string should be enclosed in quotes. For example:
-S "100,110-112,220"
This specifies sequence numbers 100, 110, 111, 112, 220. Note that the umedcmd command parses the sequence number string and issues a separate request to the Store instance for each sequence number.
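The expansion of a sequence number string into individual sequence numbers could be sketched as below. This is an illustration of the syntax described above, not the parser used by umedcmd; the function name is hypothetical, and this sketch rejects embedded whitespace and descending ranges.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Expand a string such as "100,110-112,220" into individual sequence
 * numbers. Returns the count written to 'out', or -1 on a syntax error
 * or if 'out' (max_out entries) is too small. */
long ex_expand_sqns(const char *s, unsigned long *out, size_t max_out)
{
    size_t n = 0;

    while (*s != '\0') {
        char *end;
        unsigned long lo, hi;

        /* Each item must start with a digit. */
        if (!isdigit((unsigned char)*s)) return -1;
        lo = strtoul(s, &end, 10);
        hi = lo;
        s = end;

        /* Optional "-N" turns the item into a range. */
        if (*s == '-') {
            s++;
            if (!isdigit((unsigned char)*s)) return -1;
            hi = strtoul(s, &end, 10);
            s = end;
            if (hi < lo) return -1;
        }

        /* Emit lo..hi inclusive, one entry per sequence number. */
        for (unsigned long v = lo; ; v++) {
            if (n >= max_out) return -1;
            out[n++] = v;
            if (v == hi) break;
        }

        /* Items are separated by commas; a trailing comma is an error. */
        if (*s == ',') {
            s++;
            if (*s == '\0') return -1;
        } else if (*s != '\0') {
            return -1;
        }
    }
    return (long)n;
}
```

Each expanded value would then go into its own mark request, in line with the one-request-per-sequence-number behavior described above.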
If "-S sqn_string" is omitted from the command line, the program enters an interactive mode in which the user can enter any number of sequence number strings. In interactive mode, use "h" for a brief help screen, and "q" to quit.
For example:
umedcmd -c dstats.cfg -m mark -T TCP:10.29.3.16:12009 -i 10.29.3.16 -p 12000 -s 127025183 -S "500"
In this example, the Store Process's Daemon Control context has its UIM port configured as 12009 (see Store Daemon Control Request Addressing), and the Store instance is configured for port 12000 (with the "port" attribute of the UMP Element "<store>"). The source registration ID is 127025183. The message with sequence number 500 is marked invalid.
This form of the umedcmd command is used to deregister a failed receiver. This deletes the state information for that receiver. See Request: Deregister Receiver.
*******************************************************************************
Usage: umedcmd -m deregister -c config_file -i store_interface -p store_port -s src_regid -T target_string [-r rcvr_regid] [-L linger]
Available options:
  -c, --config=FILE      Use LBM configuration file FILE.
                         Multiple config files are allowed.
                         Example:  '-c file1.cfg -c file2.cfg'
  -h, --help             display this help and exit
  -i, --store_interface  store interface IPv4 address [required]
  -p, --store_port       store port [required]
  -L, --linger=NUM       linger for NUM seconds before closing context
  -m, --mode=TYPE        set the command mode to TYPE 'deregister' [required]
  -r, --rcvr_regid=LIST  LIST of one or more receiver registration IDs with store repository, e.g.:
                         '-r 127025171'              deregister single receiver
                         '-r 127025171, 127025162'   deregister two receivers
  -s, --src_regid=ID     source registration ID associated with the store repository [required]
  -T, --target=TARGET    TARGET string for unicast immediate messages [required]
*******************************************************************************
The "-m deregister" command-line option must be supplied.
The "-T target_string" contains the unicast immediate message destination address of the Daemon Control context UIM port (see Store Daemon Control Request Addressing).
The "-i store_interface" and "-p store_port" are required parameters which identify the desired Store instance within the Store Process, as specified in the UMP Element "<store>" attributes "interface" and "port".
The "-s src_regid" and "-r rcv_regid" parameters combine to identify the specific receiver state that will be deleted. Receiver state is stored according to a pair of registration IDs: source, receiver. (Session IDs may not be used.) For example, let's say there are two persisted sources for the same topic with registration IDs 100 and 200. A receiver with registration ID 300 will have two sets of state: state for the pair 100, 300 and state for the pair 200, 300.
Note that the "-r rcv_regid" parameter can have a comma-separated list of receiver registration IDs. This is handy if you need to deregister all receivers for a particular source. The rcv_regid should be enclosed in quotes. Note that the umedcmd command parses the receiver registration IDs and issues a separate request to the Store instance for each rcv_regid.
Also note that if "-r rcv_regid" is omitted from the command line, the program enters an interactive mode in which the user can enter any number of receiver registration IDs. In interactive mode, use "h" for a brief help screen, and "q" to quit.
For example:
umedcmd -c dstats.cfg -m deregister -T TCP:10.29.3.16:12009 -i 10.29.3.16 -p 12000 -s 127025183 -r "127025184"
In this example, the Store Process's Daemon Control context has its request port configured as 12009 (see Store Daemon Control Request Addressing), and the Store instance is configured for port 12000 (with the "port" attribute of the UMP Element "<store>"). The source registration ID is 127025183 and the receiver registration ID is 127025184. This pair of registration IDs is used by the Store instance to delete the receiver state.