[ClusterLabs] Antw: Re: pacemaker with sbd fails to start if node reboots too fast.

Sat Dec 2 02:30:59 EST 2017

01.12.2017 22:36, Gao,Yan wrote:
> On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:
>> 30.11.2017 16:11, Klaus Wenninger wrote:
>>> On 11/30/2017 01:41 PM, Ulrich Windl wrote:
>>>>
>>>>>>> "Gao,Yan" <ygao at suse.com> wrote on 30.11.2017 at 11:48 in
>>>>>>> message
>>>> <e71afccc-06e3-97dd-c66a-1b4bac550c23 at suse.com>:
>>>>> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
>>>>>> SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two node cluster with
>>>>>> VM on VSphere using shared VMDK as SBD. During basic tests by killing
>>>>>> corosync and forcing STONITH pacemaker was not started after reboot.
>>>>>> In logs I see during boot
>>>>>>
>>>>>> Nov 22 16:04:56 sapprod01s crmd[3151]:     crit: We were allegedly
>>>>>> just fenced by sapprod01p for sapprod01p
>>>>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:  warning: The crmd
>>>>>> process (3151) can no longer be respawned,
>>>>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]:   notice: Shutting down
>>>>>> Pacemaker
>>>>>> SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
>>>>>> stonith with SBD always takes msgwait (at least, visually host is not
>>>>>> declared as OFFLINE until 120s have passed). But the VM reboots lightning fast
>>>>>> and is up and running long before timeout expires.
>>>> As msgwait was intended for the message to arrive, and not for the
>>>> reboot time (I guess), this just shows a fundamental problem in SBD
>>>> design: Receipt of the fencing command is not confirmed (other than
>>>> by seeing the consequences of its execution).
>>>
>>> The 2 x msgwait is not for confirmations but for writing the poison-pill
>>> and for having it read by the target-side.
>>
>> Yes, of course, but that's not what Ulrich likely intended to say.
>> msgwait must account for worst case storage path latency, while in
>> normal cases it happens much faster. If fenced node could acknowledge
>> having been killed after reboot, stonith agent could return success much
>> earlier.
> How could an alive man be sure he died before? ;)
> 

It does not need to. It simply needs to write something on startup to
indicate it is back.

Actually, the fenced side already does this: it clears the pending message
when sbd is started. It is the fencing side that simply sleeps
unconditionally for msgwait:

        if (mbox_write_verify(st, mbox, s_mbox) < -1) {
                rc = -1; goto out;
        }
        if (strcasecmp(cmd, "exit") != 0) {
                cl_log(LOG_INFO, "Messaging delay: %d",
                                (int)timeout_msgwait);
                sleep(timeout_msgwait);
        }

What if, instead of sleeping, we periodically checked the slot for an
acknowledgement until the msgwait timeout expires? Then we could return earlier.
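
To make the idea concrete, here is a rough, untested-against-sbd sketch of
such a polling loop. It is not a patch: wait_for_ack(), slot_cleared_fn,
delay_fn and the fake_slot helper are names made up for illustration, and
the callback stands in for whatever code would re-read the target's slot
from disk. The poll interval is also an assumption. On timeout the caller
would behave exactly as the unconditional sleep does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callback: returns true once the target's slot no longer
 * holds the message we wrote (e.g. sbd on the target cleared it on start). */
typedef bool (*slot_cleared_fn)(void *ctx);

/* Delay hook, so the loop can be exercised without really sleeping. */
typedef void (*delay_fn)(unsigned seconds);

void real_delay(unsigned seconds) { sleep(seconds); }
void no_delay(unsigned seconds) { (void)seconds; }

/* Poll every interval_s seconds for at most timeout_s seconds.
 * Returns the number of seconds waited; *acked tells whether the wait
 * ended because the slot was seen cleared (true) or timed out (false). */
unsigned wait_for_ack(slot_cleared_fn cleared, void *ctx, delay_fn delay,
                      unsigned timeout_s, unsigned interval_s, bool *acked)
{
    unsigned waited = 0;

    assert(interval_s > 0);
    *acked = false;

    while (waited < timeout_s) {
        if (cleared(ctx)) {
            *acked = true;
            break;
        }
        /* Never sleep past the overall timeout. */
        unsigned remaining = timeout_s - waited;
        unsigned step = remaining < interval_s ? remaining : interval_s;
        delay(step);
        waited += step;
    }
    return waited;
}

/* Stand-in for a real slot: reports "cleared" after clear_after polls. */
struct fake_slot {
    unsigned clear_after;
    unsigned polls;
};

bool fake_slot_cleared(void *ctx)
{
    struct fake_slot *s = ctx;
    s->polls++;
    return s->polls > s->clear_after;
}
```

In slot_msg() the sleep(timeout_msgwait) call would then become something
like wait_for_ack(check, ctx, real_delay, timeout_msgwait, 1, &acked), with
check being a function that reads the mbox back. Whether a cleared slot is
sufficient proof that the node really went down, and not merely that sbd
was restarted, is a separate question that this sketch does not answer.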