[ClusterLabs] Howto stonith in the case of any interface failure?

Kadlecsik József kadlecsik.jozsef at wigner.mta.hu
Wed Oct 9 13:37:53 EDT 2019


On Wed, 9 Oct 2019, Jan Pokorný wrote:

> On 09/10/19 09:58 +0200, Kadlecsik József wrote:
> > The nodes in our cluster have got backend and frontend interfaces: the 
> > former ones are for the storage and cluster (corosync) traffic and the 
> > latter ones are for the public services of KVM guests only.
> > 
> > One of the nodes had a failure ("watchdog: BUG: soft lockup - CPU#7
> > stuck for 23s"), with the result that the node could process traffic on
> > the backend interface but not on the frontend one. Thus the services
> > became unavailable, but the cluster thought the node was all right and
> > did not stonith it.
> > What is the best way to solve the problem?
> Looks like heuristics of corosync-qdevice that would ping/attest your
> frontend interface could be a way to go.  You'd need an additional
> host in your setup, though.

As far as I can see, corosync-qdevice can only add to or increase the
votes of a node; it cannot decrease them. I hope I'm wrong; I wouldn't
mind adding an additional host :-)
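
For illustration only: a heuristics check of the kind suggested above is, as far as I understand it, just a command whose exit status (zero for success) tells corosync-qdevice whether the node looks healthy. Below is a minimal sketch of such a probe in Python, assuming iputils ping is available. The target address, the interface name and the function names are hypothetical and not taken from any real setup. It only shows the probe itself; whether a failing probe can actually get the node fenced is the open question above.

```python
import subprocess


def build_ping_command(target, interface, count=1, timeout_s=2):
    """Build an iputils ping command that sends packets via one interface."""
    return [
        "ping", "-q",
        "-c", str(count),      # number of echo requests to send
        "-W", str(timeout_s),  # seconds to wait for each reply
        "-I", interface,       # send through the (hypothetical) frontend interface
        target,
    ]


def frontend_is_alive(target, interface, run=subprocess.run):
    """Return True if the target answers a ping sent via the given interface.

    `run` is injectable so the logic can be exercised without network access.
    """
    try:
        result = run(
            build_ping_command(target, interface),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,  # hard upper bound in case ping itself hangs
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping missing, not executable, or stuck: treat as a failed check
        return False
    return result.returncode == 0


def exit_status(target, interface, run=subprocess.run):
    """Map the check to a process exit status: 0 = healthy, 1 = failed."""
    return 0 if frontend_is_alive(target, interface, run=run) else 1
```

A wrapper script would then end with something like `sys.exit(exit_status("192.0.2.1", "br-front"))`, where both arguments are placeholders for a host reachable only through the frontend network and the local frontend interface.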

Best regards,
E-mail : kadlecsik.jozsef at wigner.mta.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics
         H-1525 Budapest 114, POB. 49, Hungary
