[ClusterLabs] Coming in 1.1.15: Event-driven alerts

Mon May 2 18:35:45 EDT 2016

On 04/25/2016 07:28 AM, Lars Ellenberg wrote:
> On Thu, Apr 21, 2016 at 12:50:43PM -0500, Ken Gaillot wrote:
>> Hello everybody,
>>
>> The release cycle for 1.1.15 will be started soon (hopefully tomorrow)!
>>
>> The most prominent feature will be Klaus Wenninger's new implementation
>> of event-driven alerts -- the ability to call scripts whenever
>> interesting events occur (nodes joining/leaving, resources
>> starting/stopping, etc.).
> 
> What exactly is "etc." here?
> What is the comprehensive list
> of which "events" will trigger "alerts"?

The exact list should be documented in Pacemaker Explained before the
final 1.1.15 release. I think it's comparable to what crm_mon -E does
currently. The basic categories are node events, fencing events, and
resource events.

> My guess would be
>  DC election/change
>    which does not necessarily imply membership change
>  change in membership
>    which includes change in quorum
>  fencing events
>    (even failed fencing?)
>  resource start/stop/promote/demote
>   (probably) monitor failure?
>    maybe only if some fail-count changes to/from infinity?
>    or above a certain threshold?
> 
>  change of maintenance-mode?
>  node standby/online (maybe)?
>  maybe "resource cannot be run anywhere"?

It would certainly be possible to expand alerts to more situations if
there is a need. I think the existing ones will be sufficient for common
use cases though.

> would it be useful to pass in the "transaction ID"
> or other pointer to the recorded cib input at the time
> the "alert" was triggered?

Possibly, though it isn't passed currently. We do pass a node-local
counter and a subsecond-resolution timestamp to help with ordering.
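
As a rough illustration of how a consumer might use those two values,
here is a minimal Python sketch. The Alert record and its field names
are hypothetical, not part of Pacemaker. It assumes alerts have already
been collected somewhere along with the node name, counter, and
timestamp, and that node clocks are reasonably synchronized.

```python
from typing import List, NamedTuple


class Alert(NamedTuple):
    """Hypothetical record of one received alert (names are illustrative)."""
    node: str         # node that generated the alert
    sequence: int     # node-local counter passed with the alert
    timestamp: float  # seconds, including the sub-second part


def order_alerts(alerts: List[Alert]) -> List[Alert]:
    # Sort primarily by timestamp; the node-local counter breaks ties
    # between alerts from the same node that share a timestamp.
    return sorted(alerts, key=lambda a: (a.timestamp, a.node, a.sequence))
```

The counter is only meaningful within one node, so across nodes the
timestamp is all this sketch has to go on.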

> can an alert "observer" (alert script) "register"
> for only a subset of the "alerts"?

Not explicitly, but the alert type is passed in as an environment
variable, so the script can simply exit for "uninteresting" event types.
That's not as efficient since the process must still be spawned, but it
simplifies things.
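
A minimal sketch of that kind of filtering, written in Python. The
variable name CRM_alert_kind and the values "node" and "fencing" are
assumptions on my part, since the final names should be taken from the
documentation once it is written. The set of interesting types is just
an example.

```python
#!/usr/bin/env python3
import os

# Assumption: the alert type is passed in this environment variable.
KIND_VAR = "CRM_alert_kind"

# Example choice: only react to node and fencing events.
INTERESTING = {"node", "fencing"}


def should_handle(environ, interesting=INTERESTING):
    """Return True if the alert type in environ is one we care about."""
    return environ.get(KIND_VAR) in interesting


def main(environ=os.environ):
    if not should_handle(environ):
        # Uninteresting (or missing) alert type: do nothing.
        return
    # Placeholder for real handling (send mail, log, etc.).
    print("alert received: kind=%s" % environ[KIND_VAR])


if __name__ == "__main__":
    main()
```

The process is still spawned for every alert, as noted above, but the
early return keeps the cost of the ignored ones small.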

> if so, can this filter be per alert script,
> or per "recipient", or both?
> 
> Thanks,
> 
>     Lars
>