[ClusterLabs] SBD fencing and crashkernel question
Roger Zhou
ZZhou at suse.com
Mon Oct 21 06:09:35 EDT 2019
On 10/20/19 7:03 AM, Strahil Nikolov wrote:
> Hello Community,
>
> I have a question about the stack in newer versions compared to our SLES
> 11 openais stack.
> Can someone clarify whether a node with SBD will invoke the crashkernel
> before killing itself?
>
> According to my tests on SLES 11, when another node kills the
> unresponsive one, the crashkernel is invoked and a dump is present in
> /var/crash, but if the node gets stuck for some reason (naughty admin),
> there is no sign of a crash (checked on the iLO to be sure).
>
"crashdump" is an SBD option that has to be configured explicitly; it is
not enabled by default.
See `man sbd` for the "-r" option, or set "SBD_TIMEOUT_ACTION" in
/etc/sysconfig/sbd.
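For example, a minimal sketch of the relevant line in /etc/sysconfig/sbd.
This assumes your sbd version supports the timeout action and that kdump is
already set up on the node (crashkernel= memory reserved, kdump service
enabled). The value shown is only an illustration, so check `man sbd` for
what your build accepts:

```shell
# /etc/sysconfig/sbd (fragment; illustrative value, not a tested recommendation)
# The default is "flush,reboot". Replacing "reboot" with "crashdump" asks sbd
# to trigger a kernel crash dump instead of a plain reboot when it
# self-fences on timeout.
SBD_TIMEOUT_ACTION=flush,crashdump
```

The same choice can be passed on the sbd command line with "-r". A change
here normally takes effect only once sbd is restarted, which in practice
means restarting the cluster stack on that node.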
> I'm not sure if this behaviour is the same on newer software versions
> (SLES 12/15) and whether I can work around it, as we still struggle to find
> the reason why our clusters fence in a very specific situation (the
> clusters are using MDADM raid1-s on a dual-DC environment instead of SAN
> replication) where the remote DC is unavailable for 20-30s until SAN/Network
> is rerouted.
Do you mean cluster-md RAID1 here?
If so, see page 18 of [1]:
https://github.com/zzhou1/ks/blob/master/2018-06.%E5%BB%B6%E4%BC%B8Linux%E5%85%B3%E9%94%AE%E4%B8%9A%E5%8A%A1%E5%88%B0%E5%8F%8C%E6%B4%BBNVMe-oF%E5%AD%98%E5%82%A8.OpenInfra18.v8.pdf
> We have enabled crashdump on some of the systems, but we
> are pending a reboot and then a real DC<->DC connectivity outage to
> gather valuable info, as corosync is using dual rings and is not
> affected, SBD is using survive on pacemaker and we suspect that the
> nodes suicide.
>
I can't quite follow all of this; could you rephrase it with a bit more
detail?
Cheers,
Roger
> Best Regards,
> Strahil Nikolov