[ClusterLabs] SBD fencing and crashkernel question

Strahil Nikolov hunter86_bg at yahoo.com
Sat Oct 19 19:03:52 EDT 2019


Hello Community,
I have a question about how the stack in newer versions behaves compared to our SLES 11 openais stack. Can someone clarify whether a node with SBD will invoke the crash kernel before killing itself?
According to my tests on SLES 11, when another node kills the unresponsive one, the crash kernel is invoked and a dump is present in /var/crash. But if the node itself gets stuck for some reason (naughty admin), there is no sign of a crash (I checked on the iLO to be sure).
I'm not sure whether this behaviour is the same on newer software versions (SLES 12/15), or whether I can work around it. We are still struggling to find out why our clusters fence in one very specific situation: the clusters use MDADM RAID1 arrays in a dual-DC environment instead of SAN replication, and the remote DC is unavailable for 20-30s until the SAN/network is rerouted. We have enabled crashdump on some of the systems, but we are still waiting for a reboot and then a real DC<->DC connectivity outage to gather useful information. Corosync is using dual rings and is not affected, and SBD is using survive on pacemaker, so we suspect that the nodes suicide.
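For the systems still pending a reboot, a rough way to tell whether crashdump is really armed is to check two things: that the running kernel was booted with a crashkernel= parameter, and that a crash kernel is currently loaded. The sketch below is only an illustration and not part of our setup. It assumes the standard Linux locations /proc/cmdline and /sys/kernel/kexec_crash_loaded, and the helper names are made up for this example.

```python
from pathlib import Path

# Assumed standard Linux paths; adjust if your system differs.
CMDLINE_PATH = "/proc/cmdline"
KEXEC_FLAG_PATH = "/sys/kernel/kexec_crash_loaded"


def crashkernel_reserved(cmdline: str) -> bool:
    """Return True if the kernel command line contains a crashkernel= parameter."""
    return any(token.startswith("crashkernel=") for token in cmdline.split())


def crash_kernel_loaded(flag_text: str) -> bool:
    """Return True if the kexec_crash_loaded flag reads 1 (crash kernel loaded)."""
    return flag_text.strip() == "1"


def read_text(path: str) -> str:
    """Read a file, returning an empty string if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    reserved = crashkernel_reserved(read_text(CMDLINE_PATH))
    loaded = crash_kernel_loaded(read_text(KEXEC_FLAG_PATH))
    print(f"crashkernel= on running kernel command line: {reserved}")
    print(f"crash kernel loaded: {loaded}")
    if not (reserved and loaded):
        print("A kernel panic on this node would probably not leave a dump yet.")
```

If either check is false on a node, that node has most likely not been rebooted into the crashkernel-enabled configuration, so a missing dump in /var/crash after a fence would not tell us much.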
Best Regards,
Strahil Nikolov