[ClusterLabs] Coming in 1.1.15: Event-driven alerts

Ken Gaillot kgaillot at redhat.com
Thu Apr 21 13:50:43 EDT 2016

Hello everybody,

The release cycle for 1.1.15 will start soon (hopefully tomorrow)!

The most prominent feature will be Klaus Wenninger's new implementation
of event-driven alerts -- the ability to call scripts whenever
interesting events occur (nodes joining/leaving, resources
starting/stopping, etc.).

This is the improved successor to both the ClusterMon resource agent and
the experimental "notification-agent" feature that has been in the
upstream master branch.

The new feature was renamed to "alerts" to avoid confusion with the
unrelated "notify" resource action.

High-level tools such as crm and pcs should eventually provide an easy
way to configure this, but at the XML level, the cluster configuration
may now contain an alerts section:


The alerts section can have any number of alerts, which look like:

   <alert id="alert-1"
          path="...">

      <recipient id="alert-1-recipient-1"
                 value="/var/log/cluster-alerts.log" />

   </alert>

As always, id is simply a unique label for the entry. The path is an
arbitrary file path to an alert script. Existing external scripts used
with ClusterMon resources will work as alert scripts, because the
interface is compatible.

We intend to provide sample scripts in the extra/alerts source
directory. The existing pcmk_notify_sample.sh script has been moved
there (as pcmk_alert_sample.sh), and so has pcmk_snmp_helper.sh.

Each alert may have any number of recipients configured. These values
will simply be passed to the script as arguments. The first recipient
will also be passed as the CRM_alert_recipient environment variable, for
compatibility with existing scripts that only support one recipient.
(All CRM_alert_* variables will also be passed as CRM_notify_* for
compatibility with existing ClusterMon scripts.)
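
As a rough illustration of the recipient handling described above, here is
a minimal shell sketch, not one of the shipped sample scripts. The function
names first_recipient and log_alert are hypothetical, and it assumes each
recipient is a writable file path, as in the XML example.

```shell
#!/bin/sh
# Hypothetical alert script sketch: append one line per recipient.
# Assumes recipients arrive as arguments and that the first one is also
# available in CRM_alert_recipient (or CRM_notify_recipient for older setups).

# Prefer the new variable name, fall back to the ClusterMon-era one.
first_recipient() {
    printf '%s\n' "${CRM_alert_recipient:-${CRM_notify_recipient:-}}"
}

# Usage: log_alert MESSAGE [RECIPIENT...]
# With no recipient arguments, fall back to the single-recipient variable.
log_alert() {
    message=$1
    shift
    if [ "$#" -eq 0 ]; then
        set -- "$(first_recipient)"
    fi
    for recipient in "$@"; do
        if [ -n "$recipient" ]; then
            printf '%s\n' "$message" >> "$recipient"
        fi
    done
    return 0
}
```

A real script would end with something like log_alert "cluster event" "$@"
so that every configured recipient gets the line.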

An alert may also have instance attributes and meta-attributes, for example:

   <alert id="alert-1"
          path="...">

      <meta_attributes id="alert-1-meta">
         <nvpair id="alert-1-timeout" name="timeout" value="10s" />
      </meta_attributes>

      <instance_attributes id="alert-1-vars">
         <nvpair id="alert-1-vars-1" name="magic" value="1" />
         <nvpair id="alert-1-vars-2" name="something" value="true" />
      </instance_attributes>

      <recipient id="alert-1-recipient-1"
                 value="/var/log/cluster-alerts.log" />

   </alert>

The meta-attributes are optional properties used by the cluster.
Currently, they include "timeout" (which defaults to 30s) and
"tstamp_format" (which defaults to "%H:%M:%S.%06N", and is a
microsecond-resolution timestamp provided to the alert script as the
CRM_alert_timestamp environment variable).
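
To show how a script might use the timestamp, here is a small shell sketch.
The function name format_line is hypothetical, and the fallback to date(1)
when the variable is unset is my own assumption for running the script by
hand, not something the cluster does.

```shell
#!/bin/sh
# Sketch: prefix a message with the cluster-provided timestamp.
# CRM_alert_timestamp is formatted according to the tstamp_format
# meta-attribute; the date(1) fallback (second resolution only) is a
# hypothetical convenience for manual testing.
format_line() {
    stamp=${CRM_alert_timestamp:-$(date '+%H:%M:%S')}
    printf '%s %s\n' "$stamp" "$1"
}
```

For example, with CRM_alert_timestamp set to 13:50:43.000123, calling
format_line "node joined" prints "13:50:43.000123 node joined".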

The instance attributes are arbitrary values that will be passed as
environment variables to the alert script. This gives you a convenient
way to configure your scripts from the cluster configuration, so you can
easily reuse them.
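
A script could pick those values up as in the following sketch. The
variable names magic and something come from the XML example; the function
name alert_settings and the defaults are hypothetical.

```shell
#!/bin/sh
# Sketch: read instance attributes that the cluster passes in as
# environment variables. "magic" and "something" match the nvpair names
# in the example; the defaults (0 and false) are assumptions for the
# case where an attribute is not configured.
alert_settings() {
    magic_level=${magic:-0}
    verbose=${something:-false}
    printf 'magic=%s something=%s\n' "$magic_level" "$verbose"
}
```

Keeping the defaults inside the script means the same file can be reused
by several alerts that set only the attributes they care about.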

In the current implementation, meta-attributes and instance attributes
may also be specified within the <recipient> block, in which case they
override any values specified in the <alert> block when sent to that
recipient. Whether this stays in the final 1.1.15 release depends on
whether people find it useful or confusing.

Sometime during the 1.1.15 release cycle, the previous experimental
interface (the notification-agent and notification-recipient cluster
properties) will be disabled by default at compile-time. If you are
compiling the master branch from source and require that interface, you
can define RHEL7_COMPAT when building, to enable support.

This feature is already in the upstream master branch, and will be in
the forthcoming 1.1.15-rc1 release candidate. Everyone is encouraged to
try it out and give feedback.

Ken Gaillot <kgaillot at redhat.com>
