[ClusterLabs Developers] Proposed future feature: multiple notification scripts

Jan Pokorný jpokorny at redhat.com
Fri Dec 4 16:32:09 UTC 2015


On 04/12/15 12:33 +1100, Andrew Beekhof wrote:
>> On 4 Dec 2015, at 2:45 AM, Jan Pokorný <jpokorny at redhat.com> wrote:
>> On 02/12/15 17:23 -0600, Ken Gaillot wrote:
>>> This will be of interest to cluster front-end developers and anyone who
>>> needs event notifications ...
>>> 
>>> One of the new features in Pacemaker 1.1.14 will be built-in
>>> notifications of cluster events, as described by Andrew Beekhof on That
>>> Cluster Guy blog:
>>> http://blog.clusterlabs.org/blog/2015/reliable-notifications/
>>> 
>>> For a future version, we're considering extending that to allow multiple
>>> notification scripts, each with multiple recipients. This would require
>>> a significant change in the CIB. Instead of a simple cluster property,
>>> our current idea is a new configuration section in the CIB, probably
>>> along these lines:
>>> 
>>> <configuration>
>>>   <!-- usual crm_config etc. here -->
>>> 
>>>   <!-- this is the new section -->
>>>   <notifications>
>>> 
>>>      <!-- each script would be in a notify element -->
>>>      <notify id="notify-1" path="/my/script.sh" timeout="30s">
>>> 
>>>         <recipient id="recipient-1" value="me at example.com" />
>>>         <!-- etc. for multiple recipients -->
>>> 
>>>      </notify>
>>> 
>>>      <!-- etc. for multiple scripts -->
>>> 
>>>   </notifications>
>>> </configuration>
>>> 
>>> 
>>> The recipient values would be passed to the script as command-line
>>> arguments (ex. "/my/script.sh me at example.com").
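
To make the proposed layout concrete, here is a minimal sketch of how
such a section could be turned into script invocations.  It is only an
illustration, not how Pacemaker does or will do it.  The function name
and the recipient addresses are made up, and it assumes one invocation
per recipient, which the proposal does not spell out.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the proposal above; recipient values are hypothetical.
CIB_FRAGMENT = """
<configuration>
  <notifications>
    <notify id="notify-1" path="/my/script.sh" timeout="30s">
      <recipient id="recipient-1" value="admin@example.com" />
      <recipient id="recipient-2" value="oncall@example.com" />
    </notify>
  </notifications>
</configuration>
"""


def planned_invocations(xml_text):
    """Return one [script, recipient] argument list per recipient.

    Assumption: each recipient triggers a separate run of the script,
    with the recipient value as its only command-line argument.
    """
    root = ET.fromstring(xml_text)
    commands = []
    for notify in root.findall("./notifications/notify"):
        path = notify.get("path")
        for recipient in notify.findall("recipient"):
            commands.append([path, recipient.get("value")])
    return commands


if __name__ == "__main__":
    for argv in planned_invocations(CIB_FRAGMENT):
        print(" ".join(argv))
```

The sketch only builds the argument lists.  Actually running them, and
honoring the "timeout" attribute, is left out on purpose.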
>> 
>> Just thinking out loud, Pacemaker is well adapted to cope with
>> asymmetric/heterogeneous nodes (incl. user-assisted optimizations
>> like with non-default "resource-discovery" property of a location
>> constraint, for instance).
>> 
>> Setting notifications universally for all nodes may be desired
>> in some scenarios, but may not be optimal if nodes may diverge,
> 
> Correct always wins over optimal.
> 
> I’d not be optimising around scripts that only apply to specific
> resources that also don’t run everywhere - at most you waste a few
> cycles.  If that ever becomes a real issue we can add a filter to
> the notify block.
> 
> Far worse is if a service can run somewhere new and you forgot to
> copy the script across… The knowledge doesn’t exist to report that
> as a problem.
> 
> The common scenario will be feeding fencing events into things like
> galera or nova and sending via different transports, like SNMP, SMS,
> email.  Particularly sending SNMP alerts into a fully fledged
> monitoring and alerts system that finds duplicates and does advanced
> filtering.  We do not and should not be trying to reimplement that.
> 
>> or will for sure:
>> 
>> (1) the script may not be distributed across all the nodes
> 
> That's a bug, not a feature.

See below.

>>    - or (1b) it is located at the shared storage that will become
>>      available later during cluster life cycle because it is
>>      a subject of cluster service management as well
> 
> How will that script send a notification that the shared storage is
> no longer available?

This was mostly based on the (made-up, yes) assumption that the
notification script is only checked once for existence.  On the other
hand, if it is not, a periodic recheck won't be drastically different
in complexity from a periodic dir rescan (and optimizations on some
systems do exist).

>> (2) one intentionally wants to run the notification mechanism
>>    on a subset of nodes
> 
> Can you explain to me when that would be a good idea?

I have no idea about the nitty-gritty details of how it all should
work, but it may be desirable to, e.g., decide whether the notification
agent should also run in the pacemaker_remote case.  Or you may want to
run backup SMS notifications only on the nodes with a GSM module
installed.

> Particularly when those nodes are the only remaining survivors
> (which you can’t know isn’t the case).
> If we don’t care about the services on those nodes, why did we make
> them HA?

You can achieve a good enough HA notification mechanism by combining
several non-HA notification methods, just as you do with fencing
topologies, or just as an HA cluster uses several nodes that are not
HA by themselves.

>> Note also that once you have the responsibility to distribute the
>> script on your own, you can use the same distribution mechanism to
>> share your configuration for this script, as an alternative to using
>> "value" attribute in the above proposal
> 
> So instead of using a standard pool of agents and pcs to set a
> value, I get to maintain two sets of files on every node in the
> cluster?
> And this is supposed to be a feature?

Just wanted to point out that the CIB solves only a subset of
orchestration problems.  Tools like pcs add only a tiny fraction to
this subset.

A standard pool of agents + (mostly) single-value customization via
a central place (CIB) sounds good; I am not discounting this at all.

>> (and again, this way, you
>> are free to have an asymmetric configuration).  There are tons
>> of cases like that and one has to deal with that already (some RAs,
>> file with secret for Corosync, ...).
>> 
>> What I am up to is a proposal of an alternative/parallel mechanism
>> that better fits the asymmetric (and asynchronous from cluster life
>> cycle POV) use cases: good old drop-in files.  There would simply
>> be a dedicated directory (say /usr/share/pacemaker/notify.d) where
>> the software interested in notifications would place its own
>> listener script (or a symlink to it); the script is then discovered
>> by Pacemaker upon subsequent dir rescan or inotify event, done.
>> 
>> --> no configuration needed (or is external to the CIB, or is
>>    interspersed in a non-invasive way there), install and go
>> 
>> --> it has local-only effect, equally as is local the installation
>>    of the respective software utilizing notifications
>>    (and as is local handling of the notifications!)
> 
> Still not a feature.

I am soliciting feedback to learn more about the usefulness,
if you define feature := something useful.
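
For concreteness, the discovery half of the drop-in idea could look
roughly like the sketch below.  Nothing like this exists in Pacemaker.
The function name is made up, the directory is only the example path
from my earlier mail, and it assumes that "listener" simply means any
executable regular file (or a symlink to one) in that directory.

```python
import os


def discover_listeners(directory):
    """Return sorted paths of executable files found in a drop-in dir.

    Assumptions: a missing directory means "no listeners", and only
    regular files (or symlinks resolving to them) with execute
    permission count.  Dangling symlinks and subdirectories are skipped.
    """
    try:
        names = sorted(os.listdir(directory))
    except FileNotFoundError:
        return []
    found = []
    for name in names:
        path = os.path.join(directory, name)
        # os.path.isfile() follows symlinks, so a symlink to a script
        # qualifies while a dangling one does not.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            found.append(path)
    return found


if __name__ == "__main__":
    # Hypothetical location, taken from the example in the proposal.
    for script in discover_listeners("/usr/share/pacemaker/notify.d"):
        print(script)
```

A periodic rescan would just call this again and compare the result
with the previous one; an inotify-based variant would avoid the polling
but is not shown here.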

-- 
Jan (Poki)