[ClusterLabs] monitor timed out with unknown error
arvidjaar at gmail.com
Sun May 5 12:05:53 EDT 2019
05.05.2019 18:43, Arkadiy Kulev wrote:
> Dear Andrei,
> I'm sorry for the screenshot, this is the only thing that I have left after
> the crash.
What crash do you mean? All nodes appear to be up and running and you are
able to execute commands; I do not see anything that crashed.
> What would the best course of action be in this situation?
Configure STONITH. It is mandatory so that pacemaker can resolve
situations like this one, among others.
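Before adding a fence device it helps to see what the cluster currently
has. This is only a rough sketch of read-only checks: the helper names
(run, check_fencing) and the RUN variable are made up for illustration,
and which fence agent fits your hardware is something only you can decide.
By default it just prints the commands; set RUN=1 to execute them on a
cluster node.

```shell
#!/bin/sh
# Sketch only: read-only checks of the current fencing setup.
# "run" and "check_fencing" are hypothetical helper names.

# Print each command; execute it as well only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

check_fencing() {
    # Query the stonith-enabled cluster option.
    run crm_attribute --type crm_config --name stonith-enabled --query
    # List the fence agents installed on this node.
    run stonith_admin --list-installed
    # Validate the live configuration and report problems.
    run crm_verify --live-check
}

check_fencing
```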
For now, assuming the node problems are over, you should be able to clean
up the resource state (crm_resource --cleanup). Restarting pacemaker on
all nodes would also work.
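A minimal sketch of the cleanup, assuming the failed resource is the LSB
one called mycluster from the original report (substitute your own
resource ID). The helper names and the RUN variable are hypothetical. By
default the script only prints the commands; set RUN=1 to actually run
them on a cluster node.

```shell
#!/bin/sh
# Sketch only: clear the failed-operation history of one resource so
# that pacemaker re-evaluates it. The default resource name "mycluster"
# is an assumption taken from the original report.

# Print each command; execute it as well only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

cleanup_resource() {
    resource="${1:-mycluster}"
    # Forget the recorded failures for this resource.
    run crm_resource --cleanup --resource "$resource"
    # One-shot status output to check whether "Failed Actions" is gone.
    run crm_mon -1
}

cleanup_resource "$@"
```

If the resource fails to stop again after the cleanup, it will end up
blocked again, so this is no substitute for fencing.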
> We don't have a STONITH device. But the local network is still up (both
> nodes see each other).
> Also, what does "(blocked)" mean?
It means that pacemaker cannot perform any action on this resource
because a prerequisite was not met. In this case the unmet prerequisite
was a successful stop of the resource.
> eth at ethaniel.com
> On Sun, May 5, 2019 at 9:46 PM Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>> 05.05.2019 16:14, Arkadiy Kulev wrote:
>>> I run pacemaker on 2 active/active hosts which balance the load of 2
>>> IP addresses.
>>> A few days ago we ran a very CPU/network intensive process on one of the
>>> hosts and Pacemaker failed.
>>> I've attached a screenshot of the terminal to this email.
>>> The "Failed Actions" shows that the IPaddr2 "monitor_30000" failed with
>>> "unknown error" and a status of "Timed Out" (queue=0ms exec=0ms). The
>>> /etc/init.d LSB script (mycluster) failed as well (and was set to
>>> blocked). This completely stalled Pacemaker and the second host didn't
>>> take over the IP address and gateway settings.
>>> Any ideas would be appreciated.
>> The stop operation failed and you have no stonith, so pacemaker cannot
>> continue and is stuck.
>>> [image: Screen Shot 2019-04-30 at 12.36.34.png]
>> Images are hard to reply to, consume excessive space and cannot be
>> viewed using text-only clients. There is no reason to send an image
>> when you can just copy and paste several lines of text.
>> Manage your subscription:
>> ClusterLabs home: https://www.clusterlabs.org/