[ClusterLabs] Antw: SBD fencing and crashkernel question

Mon Oct 21 02:31:43 EDT 2019

>>> Strahil Nikolov <hunter86_bg at yahoo.com> wrote on 20.10.2019 at 01:03 in
message <1223585818.2655058.1571526232579 at mail.yahoo.com>:
> Hello Community,
> I have a question about the stack in newer versions compared to our SLES 11
> openais stack. Can someone clarify whether a node with SBD will invoke a
> crashkernel before killing itself?
> According to my tests on SLES 11, when another node kills the unresponsive
> one, the crashkernel is invoked and a dump is present in /var/crash, but if
> the node gets stuck for some reason (naughty admin), there is no sign of a
> crash (I checked on the iLO to be sure).
> I'm not sure whether this behaviour is the same on newer software versions
> (SLES 12/15) and whether I can work around it. We are still struggling to
> find the reason why our clusters fence in a very specific situation: the
> clusters use MDADM RAID1 arrays in a dual-DC environment instead of SAN
> replication, and the remote DC is unavailable for 20-30s until the
> SAN/network is rerouted. We have enabled crashdump on some of the systems,
> but we are waiting for a reboot and then a real DC<->DC connectivity outage
> to gather useful info. Corosync is using dual rings and is not affected, SBD
> is using survive on pacemaker, and we suspect that the nodes commit suicide.
> Best Regards,
> Strahil Nikolov

So basically you want to know why your node is fenced? I couldn't quite understand the environment you set up, nor what kinds of problems you are seeing.
Actually, in an age of many gigabytes of RAM I see little sense in crash dumps, because they just take a lot of time to complete.

Regards,
Ulrich