[ClusterLabs] monitor timed out with unknown error

Arkadiy Kulev eth at ethaniel.com
Sun May 5 09:14:30 EDT 2019


I run Pacemaker on 2 active/active hosts which balance the load of 2 public
IP addresses.
A few days ago we ran a very CPU- and network-intensive process on one of the
2 hosts, and Pacemaker failed.

I've attached a screenshot of the terminal to this email.

The "Failed Actions" section shows that the IPaddr2 "monitor_30000" operation
failed with "unknown error" and a status of "Timed Out" (queue=0ms exec=0ms).
The /etc/init.d LSB script (mycluster) failed as well (and was set to blocked).

This completely stalled Pacemaker, and the second host didn't take over the
IP address and gateway settings.

Any ideas would be appreciated.

[image: Screen Shot 2019-04-30 at 12.36.34.png]


eth at ethaniel.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2019-04-30 at 12.36.34.png
Type: image/png
Size: 422493 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20190505/ba0b7425/attachment-0001.png>
