[ClusterLabs] Antw: SDB msgwait & partner reboot time

Wed Sep 9 02:26:05 EDT 2015

>>> Jorge Fábregas <jorge.fabregas at gmail.com> wrote on 08.09.2015 at 17:45
in message <55EF029C.3000900 at gmail.com>:
> Hi,
> 
> I've read about how important is the relationship between the different
> parameters of the SBD device (msgwait & watchdog timeout) & Pacemaker's
> stonith timeout.  However I've just encountered something that I never
> considered:  the time elapsed until a node is fully up (after being
> fenced) against msgwait.
> 
> Two nodes: sles11a & sles11b.  I fenced sles11a (via Hawk's interface
> that triggers the sbd resource agent) and watched carefully
> /var/log/messages on sles11b:
> 
> 
> Sept 8 11:27:00 sles11b  sbd: Writing reset to node slot sles11a
> Sept 8 11:27:00 sles11b  sbd: Messaging delay: 40
> 
> [sles11a is rebooting and it comes up in about 12 seconds]

Lucky you (for the fast reboot time), but you have a problem:
1) The msgwait has to be long enough to make (as close as possible to) 100%
sure that the node is down when the time has expired. Then the cluster will
perform recovery operations for the down node. If the node is up earlier and
has already joined the cluster, things may be in some disorder.
2) The msgwait has to be long enough to make sure the SBD commands are
delivered even if a disk needs some retries, or your storage system is slow
while being online (this could mean you do an "online" firmware upgrade where
the system won't respond for a few seconds).

My guess would be to increase the node boot time and to decrease the msgwait
to something like 30 seconds.

Usually you have SCSI timeouts around one minute. Also remember that parts of
the OS will retry I/O for some time before flagging an error to the
application.

> 
> [see a bunch of messages joining the cluster]
> 
> [finally node sles11a is online at about 11:27:25]
> 
> Sept 8 11:27:40 sles11b sbd: Message successfully delivered
> 
> [sles11a is put offline!]
> 
> Sept 8 11:27:41 pengine[4358]: warning: custom_action: Action
> p_stonith-sdb_monitor_0 on sles11a
>  is unrunnable (pending)

This is when the node is up and online, but fencing still isn't confirmed?
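
The timestamps in the quoted log make the overlap easy to quantify. A minimal
Python sketch, assuming the reset was written at 11:27:00, the node was back
online at about 11:27:25, and msgwait is 40 seconds (all read from the log
above; the year comes from the message date and the variable names are only
illustrative):

```python
from datetime import datetime, timedelta

# Values read from the quoted log; the year is taken from the message date.
reset_written = datetime(2015, 9, 8, 11, 27, 0)
node_online = datetime(2015, 9, 8, 11, 27, 25)  # "at about 11:27:25"
msgwait = timedelta(seconds=40)                 # "Messaging delay: 40"

# In the log, delivery is reported once msgwait has elapsed after the write.
fence_confirmed = reset_written + msgwait

# A positive value means the node rejoined before fencing was confirmed.
overlap = fence_confirmed - node_online

print("fence confirmed at", fence_confirmed.time())
print("node online", int(overlap.total_seconds()), "s before confirmation")
```

With these numbers the node had already been online for about 15 seconds when
the fence was confirmed, which matches the moment it was put offline again.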

> 
> I've done it about 5 times and it happens every time.
> 
> My values are: 20 (watchdog timeout) & 40 (msgwait).  I know I
> know..it's too much for my lab environment but I'm just curious if
> there's something wrong or if indeed msgwait NEEDS to be ALWAYS less
> than reboot-time.

If you want to have an exciting configuration, you could try to get watchdog
timeout down to 5 seconds or so, and shorten the msgwait (and possibly other
dependent parameters). But make sure support accepts such short values.

BTW: We have a msgwait close to 3 minutes, allowing the storage to be
unresponsive for up to 60 seconds. The difference is a safety margin for
possible retries... Our physical hosts hardly boot in less than 4 minutes.
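
To compare the two setups mentioned in this thread, the two conditions from
points 1) and 2) above can be written as a small check. This is only a sketch:
the helper `check_sbd_timing` and its parameters are made up for illustration
and are not part of sbd or Pacemaker; 180 and 240 seconds stand in for "close
to 3 minutes" and "4 minutes"; and no storage stall time is given for the lab
setup, so that check is skipped there.

```python
from typing import List, Optional


def check_sbd_timing(msgwait_s: int, rejoin_time_s: int,
                     storage_stall_s: Optional[int] = None) -> List[str]:
    """Return warnings for a timing setup (hypothetical helper, not an sbd API)."""
    warnings = []
    # Point 1: the fenced node should not be back before msgwait expires.
    if rejoin_time_s <= msgwait_s:
        warnings.append(
            f"node can rejoin after {rejoin_time_s}s, "
            f"before msgwait ({msgwait_s}s) expires")
    # Point 2: msgwait should outlast the longest tolerated storage stall.
    if storage_stall_s is not None and msgwait_s <= storage_stall_s:
        warnings.append(
            f"msgwait ({msgwait_s}s) does not cover a "
            f"{storage_stall_s}s storage stall")
    return warnings


# Lab setup from the quoted mail: msgwait 40, node online again after ~25s.
lab = check_sbd_timing(msgwait_s=40, rejoin_time_s=25)
# Setup described above: msgwait ~180s, 60s storage stall, ~240s boot.
prod = check_sbd_timing(msgwait_s=180, rejoin_time_s=240, storage_stall_s=60)

print("lab:", lab)
print("prod:", prod)
```

The lab values trip the first check, while the longer production values pass
both.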

Regards,
Ulrich