[ClusterLabs] Alert notes

Klaus Wenninger kwenning at redhat.com
Wed Jun 15 16:45:09 UTC 2016


On 06/15/2016 06:11 PM, Ferenc Wágner wrote:
> Hi,
>
> Please find some random notes about my adventures testing the new alert
> system.
>
> The first alert example in the documentation has no recipient:
>
>     <alert id="my-alert" path="/path/to/my-script.sh" />
>
>     In the example above, the cluster will call my-script.sh for each
>     event.
>
> while the next section starts as:
>
>     Each alert may be configured with one or more recipients. The cluster
>     will call the agent separately for each recipient.
The goal of the first example is to be as simple as possible.
But of course it makes sense to mention that adding a recipient
is not compulsory. I guess that is worth pointing out, as it would
be ugly to think you have to fake a recipient when one makes no
sense in your context.
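
As a rough sketch of what that means for an agent: a script can treat the
recipient as optional and fall back to something of its own. The function
name and the fallback path below are hypothetical, chosen only for
illustration; CRM_alert_recipient is the variable from the documentation
table.

```shell
# Hypothetical helper: decide where an alert message should go.
# CRM_alert_recipient is set by the cluster when a recipient is configured;
# the fallback path is an assumption for this example only.
alert_target() {
    # ":-" covers both an unset and an empty variable.
    printf '%s\n' "${CRM_alert_recipient:-/var/log/my-alerts.log}"
}
```

An alert configured without a recipient would then simply use the fallback.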
>
> and the rest of the documentation considers the recipient always
> present.  For example, in table 7.2:
>
>     CRM_alert_recipient    The configured recipient
>
> then
>
>     Alert agents will be called once per recipient.
>
> While in specialized cases it certainly makes sense that some alerts
> don't take recipients, I find it confusing that the first introductory
> example demonstrates something totally unacknowledged by the definitive
> text following it.
>
> I think the default timestamp should contain date and time zone
> specification to make it unambiguous.
The idea was a trade-off between length and amount of
information.
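
To make the trade-off concrete, here is a small sketch comparing a
time-only stamp with one that carries date and numeric zone. It assumes
the alert timestamp is produced from a date(1)-style format string; the
function names are made up for illustration.

```shell
# Time-only stamp: short, but ambiguous across days and time zones.
short_timestamp() {
    date '+%H:%M:%S'
}

# Stamp with date and numeric UTC offset: longer, but unambiguous.
full_timestamp() {
    date '+%Y-%m-%d %H:%M:%S %z'
}
```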
>
> Did you think about filtering the environment variables passed to the
> alert scripts?  NOTIFY_SOCKET probably shouldn't be present, and PATH
> probably shouldn't contain sbin directories; I guess all these are
> inherited from systemd in my case.
That is just the environment crmd runs with ... but an interesting point ...
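
Until something like that exists in the cluster itself, an agent could
clean up for itself. A minimal sketch, assuming it is enough to drop
NOTIFY_SOCKET and any PATH entry ending in "sbin"; both helper names are
hypothetical and this is not Pacemaker behaviour.

```shell
# Drop every PATH entry that ends in "sbin" (illustrative filter only).
strip_sbin() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -v 'sbin/*$' | paste -sd: -
}

# Hypothetical helper to call at the top of an alert script.
sanitize_alert_env() {
    unset NOTIFY_SOCKET
    PATH=$(strip_sbin "$PATH")
    export PATH
}
```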
>
> I was also hit again by the "strange umask" problem here.  It's set to
> 0026, which tends to get where nobody expects it (see for example
> https://bugs.launchpad.net/fuel/+bug/1397284,
> http://clusterlabs.org/pipermail/users/2015-June/000682.html, or
> http://bugs.clusterlabs.org/show_bug.cgi?id=5268).  In practice, alert
> scripts won't often create local files, but it's a pity we have to fight
> fallout from the logfile creation again.
Again inherited from crmd, and an interesting point ...
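
For agents that do create files, one workaround is to set the umask
explicitly rather than rely on the inherited one. A sketch, with a
hypothetical helper name and 022 picked only as a common default:

```shell
# Hypothetical helper: append a message to a log file with predictable
# permissions, whatever umask the caller inherited.
write_alert_log() {
    # Subshell, so the umask change does not leak to the caller.
    (
        umask 022
        printf '%s\n' "$2" >> "$1"
    )
}
```

With the inherited umask of 0026 a new file would come out as 0640; with
the explicit setting it is 0644.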
>
> (BTW I'd prefer to run the alert scripts as a different user than the
> various Pacemaker components, but that would lead too far now.)
Well, that is something we have thought about already, and a point
where the new feature breaks the ClusterMon interface.
Unfortunately the impact is quite high - crmd has dropped privileges -
but if the pain level rises high enough ...

>
> The SNMP agent seems to have a problem with hrSystemDate, which should
> be an OCTETSTR with strict format, not some plain textual timestamp.
> But I haven't really looked into this yet.
Actually I tried it with the snmptrap tool that comes with rhel-7.2,
and it worked with the string given in the example.
Did you copy it 1:1? There is a typo in the document: the
double quotes are doubled. The format is strict, and there are actually
two formats allowed - one with timezone and one without. The
format string given should match the latter.
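
A quick way to check a timestamp before handing it to snmptrap might look
like the sketch below. The pattern is my reading of the textual form of
DateAndTime in SNMPv2-TC (display hint "2d-1d-1d,1d:1d:1d.1d,1a1d:1d"),
so treat it as an assumption rather than what the tool actually enforces;
the function name is hypothetical.

```shell
# Succeeds if $1 looks like a DateAndTime string, with or without the
# trailing ",+H:M" / ",-H:M" time zone part.
is_snmp_date_and_time() {
    printf '%s\n' "$1" | grep -Eq \
        '^[0-9]{1,4}-[0-9]{1,2}-[0-9]{1,2},[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}\.[0-9](,[+-][0-9]{1,2}:[0-9]{1,2})?$'
}
```

A plain time such as 16:45:09.000000 would be rejected, while
2016-06-15,16:45:09.0 would pass.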





