[ClusterLabs] Antw: Coming in 1.1.15: Event-driven alerts

Fri Apr 22 07:43:00 UTC 2016

On 04/22/2016 08:16 AM, Ulrich Windl wrote:
>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 21.04.2016 at 19:50 in message
> <571912F3.2060104 at redhat.com>:
>
> [...]
>> The alerts section can have any number of alerts, which look like:
>>
>>    <alert id="alert-1"
>>           path="/srv/pacemaker/pcmk_alert_sample.sh">
>>
>>       <recipient id="alert-1-recipient-1"
>>                  value="/var/log/cluster-alerts.log" />
>>
>>    </alert>
> Are there any parameters supplied for the script? For the XML: I think "path" for the script to execute is somewhat generic: Why not call it "exec" or something like that? Likewise for "value": Isn't "logfile" a better name?
"exec" has a certain appeal...
but a recipient can be anything: an email address, a logfile, and so on.
So keeping the attribute name general, like "value", makes sense to me.
>
>> As always, id is simply a unique label for the entry. The path is an
>> arbitrary file path to an alert script. Existing external scripts used
>> with ClusterMon resources will work as alert scripts, because the
>> interface is compatible.
>>
>> We intend to provide sample scripts in the extra/alerts source
>> directory. The existing pcmk_notify_sample.sh script has been moved
>> there (as pcmk_alert_sample.sh), and so has pcmk_snmp_helper.sh.
>>
>> Each alert may have any number of recipients configured. These values
> What I did not understand is how an "alert" is related to some cluster "event": By ID, or by some explicit configuration?
There are "node", "fencing" and "resource" alerts (CRM_alert_kind tells
you which one, if you want to know inside a script). The term "alert"
was chosen because it is in sync with other frameworks like Nagios, but
you can read it as a synonym for "event": it is not necessarily anything
bad or good, just something you might be interested in.

A bunch of environment variables are set when your executable is
called. You can use them to get more info and add intelligence if you like:

CRM_alert_node, CRM_alert_nodeid, CRM_alert_rsc, CRM_alert_task,
CRM_alert_interval, CRM_alert_desc, CRM_alert_status,
CRM_alert_target_rc, CRM_alert_rc, CRM_alert_kind,
CRM_alert_version, CRM_alert_node_sequence,
CRM_alert_timestamp

Nodes and resources are referenced by node name and resource id, as
throughout the pacemaker configuration in the CIB.
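
As a rough sketch of how a script might branch on these variables (this
is not the shipped pcmk_alert_sample.sh; the function name describe_alert
is made up for illustration, and what each variable contains is inferred
from its name rather than taken from documentation):

```shell
#!/bin/sh
# Hypothetical alert script sketch, not the sample shipped with pacemaker.

# Build a one-line summary from the CRM_alert_* environment variables.
# Every variable gets a fallback so the script also runs outside the cluster.
describe_alert() {
    case "${CRM_alert_kind:-}" in
        node)
            echo "node ${CRM_alert_node:-unknown}: ${CRM_alert_desc:-}"
            ;;
        fencing)
            echo "fencing: ${CRM_alert_desc:-}"
            ;;
        resource)
            echo "resource ${CRM_alert_rsc:-unknown} ${CRM_alert_task:-unknown} on ${CRM_alert_node:-unknown}: rc=${CRM_alert_rc:-unknown}"
            ;;
        *)
            echo "unhandled alert kind '${CRM_alert_kind:-}'"
            ;;
    esac
}

# Append to the recipient (treated as a log file here) only if one was passed.
if [ -n "${CRM_alert_recipient:-}" ]; then
    echo "${CRM_alert_timestamp:-} $(describe_alert)" >> "$CRM_alert_recipient"
fi
```

Treating the recipient as a log file is only one choice; it could just as
well be handed to a mail command.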

>
>> will simply be passed to the script as arguments. The first recipient
>> will also be passed as the CRM_alert_recipient environment variable, for
>> compatibility with existing scripts that only support one recipient.
>> (All CRM_alert_* variables will also be passed as CRM_notify_* for
>> compatibility with existing ClusterMon scripts.)
>>
>> An alert may also have instance attributes and meta-attributes, for example:
>>
>>    <alert id="alert-1"
>>           path="/srv/pacemaker/pcmk_alert_sample.sh">
>>
>>       <meta_attributes id="alert-1-meta">
>>          <nvpair id="alert-1-timeout" name="timeout" value="10s" />
>>       </meta_attributes>
>>
>>       <instance_attributes id="alert-1-vars">
>>         <nvpair id="alert-1-vars-1" name="magic" value="1" />
>>         <nvpair id="alert-1-vars-2" name="something" value="true" />
>>       </instance_attributes>
>>
>>       <recipient id="alert-1-recipient-1"
>>                  value="/var/log/cluster-alerts.log" />
>>
>>    </alert>
>>
>> The meta-attributes are optional properties used by the cluster.
>> Currently, they include "timeout" (which defaults to 30s) and
>> "tstamp_format" (which defaults to "%H:%M:%S.%06N", and is a
>> microsecond-resolution timestamp provided to the alert script as the
>> CRM_alert_timestamp environment variable).
>>
>> The instance attributes are arbitrary values that will be passed as
>> environment variables to the alert script. This provides you a
>> convenient way to configure your scripts in the cluster, so you can
>> easily reuse them.
> At the moment this still sounds quite abstract.
Meta-attributes and instance attributes are used the same way as with
resources. Meta-attributes are configuration parameters that you pass
to pacemaker itself: in this case the timeout that is enforced while
the script is executed, and the format string that tells pacemaker
in which style you would like CRM_alert_timestamp to be filled.
By the way, this timestamp is created immediately before all alerts
are fired off in parallel, so it can be used to analyze what happened
in which order in the cluster. That is much better than calling date
inside a script that runs as a separate process and may have been delayed.

Instance attributes you can use to tell your script whatever
you like. Because they reside in the CIB, they are visible and
synchronized throughout the cluster.
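
A minimal sketch of a script that is configured this way, assuming the
instance attributes "magic" and "something" from the quoted example
(both are purely illustrative names, and format_line is a made-up helper):

```shell
#!/bin/sh
# Sketch only. "magic" and "something" are the illustrative instance
# attribute names from the quoted example, not real pacemaker options.

# Build one log line; fall back to defaults when an attribute is not set.
format_line() {
    line="${CRM_alert_timestamp:-no-timestamp} ${CRM_alert_kind:-unknown}"
    # The instance attribute "something" switches the extra detail on.
    if [ "${something:-false}" = "true" ]; then
        line="$line magic=${magic:-0}"
    fi
    echo "$line"
}

# Write to the recipient only if one was passed in by the cluster.
if [ -n "${CRM_alert_recipient:-}" ]; then
    format_line >> "$CRM_alert_recipient"
fi
```

Since the defaults live in the script and the actual values in the CIB,
the same script can be reused with different settings per alert.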
>> In the current implementation, meta-attributes and instance attributes
>> may also be specified within the <recipient> block, in which case they
>> override any values specified in the <alert> block when sent to that
>> recipient. Whether this stays in the final 1.1.15 release or not depends
>> on whether people find this to be useful, or confusing.
> Could you give one complete example (configuration and script), even if it's just as a sample for discussion?
>
> And will the DTD version number be incremented this time? ;-)
pcmk_alert_sample.sh is not a bad example of how to use the
environment variables that are set by default, although at the moment
it still uses the deprecated CRM_notify_... naming (instead of
CRM_alert_...), which is still supported for compatibility with
scripts written for the two predecessor implementations.

Another example of a configuration, which also shows instance attributes
and how to override them inside the recipient section, would be:

<configuration>
  <alerts>
    <alert id="notify_9"
           path="/usr/share/pacemaker/tests/pcmk_alert_sample.sh">
      <meta_attributes id="meta_9">
        <nvpair id="tstamp9" name="tstamp_format" value="%H:%M:%S.%06N"/>
      </meta_attributes>
      <instance_attributes id="global_vars_9">
        <nvpair id="global_var9_1" name="variable1" value="1"/>
        <nvpair id="global_var9_2" name="global2" value="1"/>
      </instance_attributes>
      <recipient id="recipient_9" value="/tmp/alerts.log">
        <instance_attributes id="local_vars_9">
          <nvpair id="local_var9_1" name="variable2" value="2"/>
          <nvpair id="local_var9_2" name="global1" value="overwritten"/>
        </instance_attributes>
      </recipient>
    </alert>
  </alerts>
</configuration>

To get a feeling for it, you can add "set >> /tmp/set.txt" somewhere in
pcmk_alert_sample.sh.
But it is actually simple: just use them as environment variables with
the names you specified, without any prefix or anything else.
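
If the full "set" output is too noisy, a narrower dump could look like
the sketch below. The helper name dump_alert_env is made up, and the
name patterns are only an assumption that matches the attribute names
used in the configuration above:

```shell
#!/bin/sh
# Sketch only: list the alert-related environment instead of everything.
# The patterns cover CRM_alert_* plus the example names variableN/globalN;
# adjust them to whatever instance attribute names you configured.
dump_alert_env() {
    env | grep -E '^(CRM_alert_[A-Za-z_]+|variable[0-9]+|global[0-9]+)=' | sort
}

# Append the dump to the recipient only if one was passed in.
if [ -n "${CRM_alert_recipient:-}" ]; then
    dump_alert_env >> "$CRM_alert_recipient"
fi
```

For recipient_9 above, variable1, global2, variable2 and global1 should
all show up next to the CRM_alert_* variables.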

Yes, the CIB has version 2.5 now.

"Pacemaker Explained" is going to be updated for this feature,
and I intend to include a perhaps snappier example there as well.

>
>> Sometime during the 1.1.15 release cycle, the previous experimental
>> interface (the notification-agent and notification-recipient cluster
>> properties) will be disabled by default at compile-time. If you are
>> compiling the master branch from source and require that interface, you
>> can define RHEL7_COMPAT when building, to enable support.
>>
>> This feature is already in the upstream master branch, and will be in
>> the forthcoming 1.1.15-rc1 release candidate. Everyone is encouraged to
>> try it out and give feedback.
>
> Regards,
> Ulrich