[ClusterLabs Developers] pacemaker POC to execute external program in case of RA timeout

Klaus Wenninger kwenning at redhat.com
Mon May 31 11:12:12 UTC 2021


On 5/31/21 10:53 AM, Emil Penchev wrote:
> Hi all,
>
> I'm writing about an issue reported to us by a pacemaker user 
> concerning RA timeouts.
> Some users have encountered a timeout from an RA script/program, and 
> this led to a major outage for them.
> As is typical in such cases, there is no additional useful 
> information to explain why it happened.
> The user has proposed a solution: a POC that instruments pacemaker 
> directly and adds a way to activate further debugging via an 
> external callout program.
> One can set an environment variable, for example *PCMK_timeout_prog*, 
> that points to an external program or script to be executed to 
> collect more useful debug information.
>
> Here is the proposed POC with minor changes.
> https://github.com/tickbg/pacemaker/compare/master...a453d30
>
If you create a pull request directly, we would be able to use
GitHub for the discussion.


Pacemaker already has the alerts feature, which allows calling
scripts on various occasions. One of those is resource actions.
So it might make sense to consider extending that feature to
cover your case as well.

At the moment your script would get the return code of the RA
passed to it. I'm actually unsure what happens in case of a
timeout.
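
A minimal sketch of what such an alert script could look like today,
with the filtering done in the script itself. It relies on the
CRM_alert_* environment variables that Pacemaker passes to alert
agents. Two things are assumptions and not confirmed here: that a
timed-out action shows up as execution status 2 in CRM_alert_status,
and the TIMEOUT_DEBUG_PROG variable, which is a made-up name for this
example and not a Pacemaker option.

```python
#!/usr/bin/env python3
"""Sketch of an alert agent that reacts only to timed-out resource actions."""
import os
import subprocess

# Assumption: execution status 2 is what Pacemaker reports for a timeout.
ASSUMED_TIMEOUT_STATUS = "2"

# Hypothetical variable naming the debug program; not a Pacemaker option.
DEBUG_PROG_VAR = "TIMEOUT_DEBUG_PROG"


def is_timeout_alert(env):
    """True for a resource alert whose status matches the assumed timeout code."""
    return (env.get("CRM_alert_kind") == "resource"
            and env.get("CRM_alert_status") == ASSUMED_TIMEOUT_STATUS)


def handle_alert(env):
    """Run the debug program for timeout alerts; do nothing for all others."""
    if not is_timeout_alert(env):
        return 0  # the filtering happens here, inside the script
    prog = env.get(DEBUG_PROG_VAR)
    if not prog:
        return 0  # no debug program configured
    # Hand resource, operation and node to the debug program as arguments.
    args = [prog,
            env.get("CRM_alert_rsc", ""),
            env.get("CRM_alert_task", ""),
            env.get("CRM_alert_node", "")]
    return subprocess.run(args, check=False).returncode


if __name__ == "__main__":
    handle_alert(dict(os.environ))
```

The script is still started for every alert and only then decides to
do nothing, which is the load that filtering ahead of the script
would avoid.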

For a script that should only be called in case of a timeout,
additional filtering before the script is called might be handy,
as it reduces the load that is generated if the filtering is done
in the script. A synchronous-call flag could be useful as well (at
the moment alerts are called in more of a fire-and-forget manner
so as not to throttle pacemaker actions).
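
To illustrate the difference between the two calling styles, here is
a generic sketch. It is not how pacemaker implements alerts; the
function names and the timeout parameter are made up for the example.

```python
import subprocess
import sys


def call_fire_and_forget(cmd):
    """Start the script and return at once; the caller does not wait for it."""
    return subprocess.Popen(cmd,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)


def call_synchronous(cmd, timeout_s):
    """Run the script and block until it exits or timeout_s seconds pass.

    Returns the exit code, or None if the script was killed on timeout.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s, check=False).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Stand-in for a debug script: a child interpreter that exits at once.
    script = [sys.executable, "-c", "pass"]
    call_fire_and_forget(script).wait()  # reap the child in this demo
    call_synchronous(script, timeout_s=5.0)
```

The synchronous variant lets the script look at the system while it
is still in the state that caused the timeout, at the price of
holding up the caller for up to timeout_s seconds.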


Klaus

>
> Emil.
