[ClusterLabs] Howto stonith in the case of any interface failure?
Kadlecsik József
kadlecsik.jozsef at wigner.mta.hu
Wed Oct 9 13:37:53 EDT 2019
Hi,
On Wed, 9 Oct 2019, Jan Pokorný wrote:
> On 09/10/19 09:58 +0200, Kadlecsik József wrote:
> > The nodes in our cluster have got backend and frontend interfaces: the
> > former ones are for the storage and cluster (corosync) traffic and the
> > latter ones are for the public services of KVM guests only.
> >
> > One of the nodes had a failure ("watchdog: BUG: soft lockup - CPU#7
> > stuck for 23s"), as a result of which the node could process traffic on
> > the backend interface but not on the frontend one. Thus the services became
> > unavailable, but the cluster thought the node was all right and did not
> > stonith it.
>
> > What is the best way to solve the problem?
>
> Looks like heuristics of corosync-qdevice that would ping/attest your
> frontend interface could be a way to go. You'd need an additional
> host in your setup, though.
As far as I can see, corosync-qdevice can only add to or increase the
votes for a node; it cannot decrease them. I hope I'm wrong; I wouldn't
mind adding an additional host :-)
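
For illustration only, here is a minimal sketch of the kind of frontend
reachability check discussed above: it pings a few addresses through the
frontend interface and reports failure when none of them answers. The
interface name, the target addresses and the function names are all
hypothetical, and the sketch assumes the iputils ping options -q, -c, -W
and -I. Whether a failing check actually leads to the node being fenced
depends on how it is wired into the cluster, which is not shown here.

```python
import subprocess

# Hypothetical values: replace with the real frontend interface and with
# addresses that are reachable only through that interface.
FRONTEND_IFACE = "eth1"
TARGETS = ["192.0.2.1", "192.0.2.2"]


def ping_once(target, iface, timeout_s=1):
    """Return True if a single ICMP echo sent via iface is answered."""
    cmd = ["ping", "-q", "-c", "1", "-W", str(timeout_s), "-I", iface, target]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 2,  # guard against ping itself hanging
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def frontend_ok(targets, iface, probe=ping_once):
    """Treat the frontend as healthy if at least one target answers."""
    return any(probe(target, iface) for target in targets)


def exit_code(targets=TARGETS, iface=FRONTEND_IFACE, probe=ping_once):
    """Map the check to a shell-style status: 0 = healthy, 1 = failed."""
    return 0 if frontend_ok(targets, iface, probe) else 1
```

To run it as a command, the file could end with "import sys" and
"sys.exit(exit_code())". Requiring only one target to answer is a
deliberate choice, so that a single unreachable peer is not mistaken for
a dead interface.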
Best regards,
Jozsef
--
E-mail : kadlecsik.jozsef at wigner.mta.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics
H-1525 Budapest 114, POB. 49, Hungary