[ClusterLabs] pacemaker with sbd fails to start if node reboots too fast.

Tue Dec 5 14:25:46 EST 2017

05.12.2017 12:59, Gao,Yan wrote:
> On 12/04/2017 07:55 PM, Andrei Borzenkov wrote:
>> 04.12.2017 14:48, Gao,Yan wrote:
>>> On 12/02/2017 07:19 PM, Andrei Borzenkov wrote:
>>>> 30.11.2017 13:48, Gao,Yan wrote:
>>>>> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
>>>>>> SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
>>>>>> VM on VSphere using shared VMDK as SBD. During basic tests by killing
>>>>>> corosync and forcing STONITH pacemaker was not started after reboot.
>>>>>> In logs I see during boot
>>>>>>
>>>>>> Nov 22 16:04:56 sapprod01s crmd[3151]:     crit: We were allegedly
>>>>>> just fenced by sapprod01p for sapprod01p
>>>>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
>>>>>> process (3151) can no longer be respawned,
>>>>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down
>>>>>> Pacemaker
>>>>>>
>>>>>> SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
>>>>>> stonith with SBD always takes msgwait (at least, visually host is not
>>>>>> declared as OFFLINE until 120s passed). But VM reboots lightning fast
>>>>>> and is up and running long before timeout expires.
>>>>>>
>>>>>> I think I have seen similar report already. Is it something that can
>>>>>> be fixed by SBD/pacemaker tuning?
>>>>> SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.
>>>>>
>>>>
>>>> I tried it (on openSUSE Tumbleweed which is what I have at hand, it has
>>>> SBD 1.3.0) and with SBD_DELAY_START=yes sbd does not appear to watch
>>>> disk at all.
>>> It simply waits that long on startup before starting the rest of the
>>> cluster stack to make sure the fencing that targeted it has returned. It
>>> intentionally doesn't watch anything during this period of time.
>>>
>>
>> Unfortunately it waits too long.
>>
>> ha1:~ # systemctl status sbd.service
>> ● sbd.service - Shared-storage based fencing daemon
>>     Loaded: loaded (/usr/lib/systemd/system/sbd.service; enabled; vendor
>> preset: disabled)
>>     Active: failed (Result: timeout) since Mon 2017-12-04 21:47:03 MSK;
>> 4min 16s ago
>>    Process: 1861 ExecStop=/usr/bin/kill -TERM $MAINPID (code=exited,
>> status=0/SUCCESS)
>>    Process: 2058 ExecStart=/usr/sbin/sbd $SBD_OPTS -p /var/run/sbd.pid
>> watch (code=killed, signa
>>   Main PID: 1792 (code=exited, status=0/SUCCESS)
>>
>> дек 04 21:45:32 ha1 systemd[1]: Starting Shared-storage based fencing
>> daemon...
>> дек 04 21:47:02 ha1 systemd[1]: sbd.service: Start operation timed out.
>> Terminating.
>> дек 04 21:47:03 ha1 systemd[1]: Failed to start Shared-storage based
>> fencing daemon.
>> дек 04 21:47:03 ha1 systemd[1]: sbd.service: Unit entered failed state.
>> дек 04 21:47:03 ha1 systemd[1]: sbd.service: Failed with result
>> 'timeout'.
>>
>> But the real problem is that, even though SBD failed to start, the whole
>> cluster stack continues to run; and because SBD blindly trusts that nodes
>> are well behaved, fencing appears to succeed after the timeout ... without
>> anyone taking any action on the poison pill ...
> Start of sbd reaches systemd's timeout for starting units and systemd
> proceeds...
> 

Do you consider this normal and intended behavior? Again, it is currently
possible for the cluster stack to start without working STONITH, and
because there is no confirmation that stonith via SBD worked at all,
we end up in split brain.

> TimeoutStartSec should be configured in sbd.service accordingly to be
> longer than msgwait.
> 

And where is that documented? You did not say it earlier,
/etc/sysconfig/sbd does not say it, "man sbd" does not say it. How
are users supposed to know about this?
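
For reference, the override suggested above could be sketched as a systemd
drop-in. This is only an illustration under stated assumptions: msgwait is
120s as in the original report, the journal above shows the start being
terminated after 90s (21:45:32 to 21:47:02), and the file name timeout.conf
and the value 150 are arbitrary choices, not taken from any sbd documentation.

```ini
# /etc/systemd/system/sbd.service.d/timeout.conf
# Hypothetical drop-in: give sbd more time to start than msgwait (120s here),
# so that the SBD_DELAY_START wait can finish before systemd gives up.
[Service]
TimeoutStartSec=150
```

After creating the file, "systemctl daemon-reload" is needed for systemd to
pick it up. This only addresses the start timeout; it does not change the
fact that the rest of the cluster stack kept running after sbd failed.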