[ClusterLabs] Antw: Re: Antw: Re: pacemaker with sbd fails to start if node reboots too fast.
Gao,Yan
ygao at suse.com
Tue Dec 5 09:04:29 EST 2017
On 12/05/2017 12:41 PM, Ulrich Windl wrote:
>
>
>>>> "Gao,Yan" <ygao at suse.com> schrieb am 01.12.2017 um 20:36 in Nachricht
> <e49f3c0a-6981-3ab4-a0b0-1e5f49f34a25 at suse.com>:
>> On 11/30/2017 06:48 PM, Andrei Borzenkov wrote:
>>> 30.11.2017 16:11, Klaus Wenninger wrote:
>>>> On 11/30/2017 01:41 PM, Ulrich Windl wrote:
>>>>>
>>>>>>>> "Gao,Yan" <ygao at suse.com> wrote on 30.11.2017 at 11:48 in message
>>>>> <e71afccc-06e3-97dd-c66a-1b4bac550c23 at suse.com>:
>>>>>> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
>>>>>>> SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two-node cluster with
>>>>>>> VMs on vSphere using a shared VMDK as SBD. During basic tests of killing
>>>>>>> corosync and forcing STONITH, pacemaker was not started after reboot.
>>>>>>> In logs I see during boot
>>>>>>>
>>>>>>> Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
>>>>>>> just fenced by sapprod01p for sapprod01p
>>>>>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]: warning: The crmd
>>>>>>> process (3151) can no longer be respawned,
>>>>>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]: notice: Shutting down
>>>>>>> Pacemaker
>>>>>>> SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
>>>>>>> stonith with SBD always takes msgwait (at least, visually the host is not
>>>>>>> declared OFFLINE until 120s have passed). But the VM reboots lightning fast
>>>>>>> and is up and running long before the timeout expires.
>>>>> As msgwait was intended for the message to arrive, and not for the reboot
>>>>> time (I guess), this just shows a fundamental problem in SBD design: Receipt
>>>>> of the fencing command is not confirmed (other than by seeing the
>>>>> consequences of its execution).
>>>>
>>>> The 2 x msgwait is not for confirmations but for writing the poison-pill
>>>> and for having it read by the target-side.
>>>
>>> Yes, of course, but that's not what Ulrich likely intended to say.
>>> msgwait must account for worst-case storage path latency, while in
>>> normal cases it completes much faster. If the fenced node could acknowledge
>>> having been killed after reboot, the stonith agent could return success much
>>> earlier.
>> How could a living man be sure he died before? ;)
>
> I meant: There are three delays:
> 1) The delay until data is on the disk
It takes several IOs for the sender to do this -- read the device
header, look up the slot, write the message, and verify that the message
was written (each I/O subject to timeout_io, which defaults to 3s).
As mentioned, the sender's msgwait timer starts only after the message
has been verified as written. We just need to make sure stonith-timeout
is configured sufficiently longer than the sum.
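To make "the sum" concrete, here is a rough back-of-the-envelope sketch
(Python, using the timeout values discussed in this thread; the
assumption that every sender-side I/O takes up to timeout_io and the 20%
safety margin are purely illustrative, not official recommendations):

    # Sender-side fencing budget, using the timeouts from this thread.
    TIMEOUT_WATCHDOG = 60            # seconds, watchdog timeout in this thread
    MSGWAIT = 2 * TIMEOUT_WATCHDOG   # 120 seconds, as described above
    TIMEOUT_IO = 3                   # seconds, sbd's default I/O timeout
    SENDER_IOS = 4                   # read header, look up slot, write, verify

    # The msgwait timer only starts once the message is verified as written,
    # so the sender may spend up to SENDER_IOS * TIMEOUT_IO before that.
    sender_budget = SENDER_IOS * TIMEOUT_IO + MSGWAIT

    # stonith-timeout has to be comfortably longer than this sum; the 20%
    # margin here is just an illustrative choice.
    suggested_stonith_timeout = int(sender_budget * 1.2)

    print(f"sender may need up to ~{sender_budget}s; "
          f"stonith-timeout should exceed that, e.g. {suggested_stonith_timeout}s")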
> 2) Delay until data is read from the disk
It's already taken into account by msgwait. Since the recipient keeps
reading in a loop, we don't know exactly when it starts a read for this
specific message; in the worst case, the message arrives just after a
pass has begun and is only picked up on the next one. But once a read
pass starts, it has to complete within timeout_watchdog, otherwise the
watchdog triggers. So even in a bad case, the message should be read
within 2 * timeout_watchdog. That's why the sender has to wait msgwait,
which is 2 * timeout_watchdog.
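A worked version of that bound, using the watchdog timeout from this
thread (the exact read-loop timing is simplified for illustration):

    # Why msgwait = 2 * timeout_watchdog covers the worst-case read delay.
    timeout_watchdog = 60              # seconds, value from this thread
    msgwait = 2 * timeout_watchdog     # 120 seconds

    # Worst case: the poison pill lands on disk just after the recipient
    # has started a read pass, so it is only seen on the next pass.  Each
    # pass must finish within timeout_watchdog, or the recipient's own
    # watchdog fires anyway.
    pass_that_misses_it = timeout_watchdog
    pass_that_reads_it = timeout_watchdog
    worst_case_read_delay = pass_that_misses_it + pass_that_reads_it

    assert worst_case_read_delay <= msgwait
    print(f"worst case {worst_case_read_delay}s <= msgwait {msgwait}s")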
> 3) Delay until Host was killed
The kill is triggered basically immediately once the poison pill is read.
> A confirmation before 3) could shorten the total wait that includes 2) and 3),
> right?
As mentioned in another email, a node that is alive, even one that has
indeed come back from death, cannot actually confirm that about itself,
or give any confirmation of whether it was ever dead. And a successful
fencing means the node is dead.
Regards,
Yan
>
> Regards,
> Ulrich
>
>
>>
>> Regards,
>> Yan
>>