[ClusterLabs] SBD fencing and crashkernel question
ZZhou at suse.com
Mon Oct 21 06:09:35 EDT 2019
On 10/20/19 7:03 AM, Strahil Nikolov wrote:
> Hello Community,
> I have a question about the stack in newer versions compared to our SLES
> 11 openais stack.
> Can someone clarify if a node with SBD will invoke a crashkernel before
> killing itself?
> According to my tests on SLES 11, when another node kills the
> unresponsive one, crashkernel is invoked and a dump is present at
> /var/crash, but if the node gets stuck for some reason (naughty admin),
> there is no sign of a crash (checked on the iLO to be sure).
"crashdump" is an SBD option that has to be configured explicitly.
You can `man sbd` to check the "-r" option, or "SBD_TIMEOUT_ACTION" in
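
To make the above concrete, here is a minimal sketch of how one might check whether a node is set up to crashdump on an SBD timeout. It assumes the timeout action is a comma-separated list such as "flush,crashdump" (as I understand the man page), and that the kernel exposes /sys/kernel/kexec_crash_loaded. The function names are made up for illustration, and the path of the sbd config file is passed in by the caller rather than assumed.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of sbd.

# Print the value of SBD_TIMEOUT_ACTION from the given file (empty if unset).
sbd_timeout_action() {
    # Take the last uncommented assignment and strip surrounding quotes.
    sed -n 's/^[[:space:]]*SBD_TIMEOUT_ACTION=//p' "$1" | tail -n 1 | tr -d "\"'"
}

# Succeed if the comma-separated action list contains "crashdump".
sbd_wants_crashdump() {
    case ",$(sbd_timeout_action "$1")," in
        *,crashdump,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if a crash kernel is currently loaded, i.e. kdump is armed.
crash_kernel_loaded() {
    [ "$(cat /sys/kernel/kexec_crash_loaded 2>/dev/null)" = "1" ]
}
```

Both checks need to pass for a dump to have a chance of appearing: sbd has to request a crashdump instead of a reboot, and a crash kernel has to be loaded at the time the node kills itself.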
> I'm not sure if this behaviour is the same on newer software versions
> (SLES 12/15) and if I can work around it, as we still struggle to find
> the reason why our clusters fence in a very specific situation (the
> clusters are using MDADM raid1 arrays in a dual-DC environment instead of SAN
> replication) where the remote DC is unavailable for 20-30s until SAN/Network
> is rerouted.
Do you mean cluster-md-raid1 here?
If yes, you might refer to Page 18 of (
> We have enabled crashdump on some of the systems, but we
> are pending a reboot and then a real DC<->DC connectivity outage to
> gather valuable info, as corosync is using dual-rings and is not
> affected, SBD is using survive on pacemaker and we suspect that the
> nodes suicide.
I can't quite follow all of this; you might want to rephrase it with a
bit more detail.
> Best Regards,
> Strahil Nikolov