[ClusterLabs] Howto stonith in the case of any interface failure?
Kadlecsik József
kadlecsik.jozsef at wigner.mta.hu
Wed Oct 9 14:10:11 EDT 2019
On Wed, 9 Oct 2019, Ken Gaillot wrote:
> > One of the nodes had a failure ("watchdog: BUG: soft lockup -
> > CPU#7 stuck for 23s"), after which the node could process
> > traffic on the backend interface but not on the frontend one. Thus the
> > services became unavailable, but the cluster thought the node was all
> > right and did not stonith it.
> >
> > How could we protect the cluster against such failures?
>
> See the ocf:heartbeat:ethmonitor agent (to monitor the interface itself)
> and/or the ocf:pacemaker:ping agent (to monitor reachability of some IP
> such as a gateway)
This looks really promising, thank you! Does the cluster regard it as a
failure when an ocf:heartbeat:ethmonitor agent clone on a node does not
run? :-)
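
For reference, a minimal sketch of how the two suggested agents might be
configured with pcs. The resource names (ping-gw, frontend-mon, my-service),
the interface eth0 and the address 192.0.2.1 are placeholders, not taken
from our setup, and the exact pcs syntax may differ between pcs versions:

```shell
# Hypothetical names: ping-gw, frontend-mon, my-service.
# eth0 and 192.0.2.1 are placeholders for the frontend interface and gateway.

# Clone a ping resource on every node; it records reachability in the
# node attribute "pingd" (the agent's default attribute name).
pcs resource create ping-gw ocf:pacemaker:ping \
    host_list=192.0.2.1 dampen=5s multiplier=1000 \
    op monitor interval=10s clone

# Clone an ethmonitor resource for the frontend interface; by default it
# records the state in the node attribute "ethmonitor-eth0" (1 = up, 0 = down).
pcs resource create frontend-mon ocf:heartbeat:ethmonitor \
    interface=eth0 clone

# Keep the service off any node where the gateway is unreachable
# or the attribute has not been set at all.
pcs constraint location my-service rule score=-INFINITY \
    pingd lt 1 or not_defined pingd

# Keep the service off any node whose frontend interface is not reported up.
pcs constraint location my-service rule score=-INFINITY \
    ethmonitor-eth0 ne 1
```

Note that such location constraints only move the service away from the
affected node; they do not by themselves fence it.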
Best regards,
Jozsef
--
E-mail : kadlecsik.jozsef at wigner.mta.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics
H-1525 Budapest 114, POB. 49, Hungary