[ClusterLabs] Howto stonith in the case of any interface failure?
    Kadlecsik József 
    kadlecsik.jozsef at wigner.mta.hu
       
    Wed Oct  9 14:10:11 EDT 2019
    
    
  
On Wed, 9 Oct 2019, Ken Gaillot wrote:
> > One of the nodes suffered a failure ("watchdog: BUG: soft lockup - 
> > CPU#7 stuck for 23s"), with the result that the node could process 
> > traffic on the backend interface but not on the frontend one. Thus the 
> > services became unavailable, but the cluster thought the node was all 
> > right and did not stonith it.
> > 
> > How could we protect the cluster against such failures?
> 
> See the ocf:heartbeat:ethmonitor agent (to monitor the interface itself) 
> and/or the ocf:pacemaker:ping agent (to monitor reachability of some IP 
> such as a gateway).
This looks really promising, thank you! Does the cluster regard it as a 
failure when an ocf:heartbeat:ethmonitor agent clone on a node does not 
run? :-)
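
For reference, a rough sketch in crmsh syntax of how the two suggested 
agents might be wired up. The interface name eth0, the gateway address 
192.0.2.1 and the resource name my_service are placeholders, not a real 
configuration, and the location rules assume the agents' default node 
attribute names (ethmonitor-eth0 and pingd):

```
# Placeholder names throughout: eth0, 192.0.2.1 and my_service are assumptions.

# Watch the frontend interface on every node.
primitive p_ethmon ocf:heartbeat:ethmonitor \
    params interface=eth0 \
    op monitor interval=10s timeout=60s
clone cl_ethmon p_ethmon

# Check that the gateway is reachable from every node.
primitive p_ping ocf:pacemaker:ping \
    params host_list="192.0.2.1" dampen=5s multiplier=1000 \
    op monitor interval=10s timeout=60s
clone cl_ping p_ping

# Keep the service off any node whose interface or gateway check is bad.
location loc_frontend_up my_service \
    rule -inf: not_defined ethmonitor-eth0 or ethmonitor-eth0 eq 0
location loc_gw_reachable my_service \
    rule -inf: not_defined pingd or pingd lte 0
```

As sketched, these rules would only move my_service away from the 
affected node; they would not by themselves stonith it.
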
Best regards,
Jozsef
--
E-mail : kadlecsik.jozsef at wigner.mta.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics
         H-1525 Budapest 114, POB. 49, Hungary
    
    